weitianxin / UniMP

[ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

A question about content generation. #1

Open VanillaCreamer opened 5 months ago

VanillaCreamer commented 5 months ago

The authors state that:

To make transformers generate images, please refer to VQGAN and its scripts.

Do you mean that I should construct a pipeline for content generation (RQ4) myself?

weitianxin commented 5 months ago

Thanks for the question. Our content generation task is separate from the other tasks. Although it uses the same data as the other tasks (i.e., user history, product images, and associated attributes), the processing is different: the images are first encoded by VQGAN into discrete tokens, and our model/transformer then generates content over those tokens.
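To illustrate the VQGAN step mentioned above: images are encoded into discrete codebook indices, the transformer models those token sequences, and generated tokens are mapped back through the codebook at decoding time. Below is a toy NumPy sketch of just the quantize/dequantize step, not the authors' actual pipeline; the codebook size, feature dimension, and patch count are illustrative assumptions:

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    # features: (N, D), codebook: (K, D) -> indices: (N,)
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def dequantize(indices, codebook):
    """Recover approximate features from discrete token indices."""
    return codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))   # K=16 codes, D=8 dims (toy sizes)
features = rng.normal(size=(4, 8))    # e.g. encoder outputs for 4 image patches

tokens = quantize(features, codebook)  # discrete tokens the transformer is trained on
recon = dequantize(tokens, codebook)   # approximate features fed to the decoder
```

In the real setup, a trained VQGAN encoder produces the feature grid and its decoder turns dequantized features back into pixels; the transformer only ever sees the integer token sequence.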