weitianxin / UniMP

[ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

A question about content generation. #1

Open manjusaka-L opened 3 months ago

manjusaka-L commented 3 months ago

The authors state that:

"To make transformers generate images, please refer to VQGAN and its scripts."

Do you mean that I should construct the content generation (RQ4) pipeline myself?

weitianxin commented 2 months ago

Thanks for the question. Our content generation task is separate from the other tasks. Although it uses the same data (i.e., user history, product images, and associated attributes), the processing is different: the product images are first tokenized with VQGAN, and our model/transformer then generates content over those discrete tokens.
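
For reference, here is a minimal sketch of that kind of pipeline, not the repo's actual code: images are encoded into discrete VQGAN codebook indices, a transformer models those index sequences, and predicted indices are decoded back to pixels. It assumes the taming-transformers `VQModel` interface; the config path, checkpoint path, and helper names are placeholders.

```python
# Minimal sketch (assumptions: taming-transformers VQModel interface,
# placeholder config/checkpoint paths) of VQGAN image tokenization.
import torch
from omegaconf import OmegaConf
from taming.models.vqgan import VQModel

config = OmegaConf.load("vqgan_config.yaml")          # placeholder config
vqgan = VQModel(**config.model.params)
state = torch.load("vqgan.ckpt", map_location="cpu")  # placeholder checkpoint
vqgan.load_state_dict(state["state_dict"], strict=False)
vqgan.eval()

@torch.no_grad()
def image_to_codes(images: torch.Tensor) -> torch.Tensor:
    """Map a (B, 3, H, W) image batch to a (B, L) sequence of codebook indices."""
    _, _, (_, _, indices) = vqgan.encode(images)
    return indices.view(images.size(0), -1)

@torch.no_grad()
def codes_to_images(indices: torch.Tensor, grid_hw: tuple) -> torch.Tensor:
    """Decode a (B, L) sequence of codebook indices back to pixel space."""
    b = indices.size(0)
    quant = vqgan.quantize.get_codebook_entry(
        indices.reshape(-1),
        shape=(b, grid_hw[0], grid_hw[1], vqgan.quantize.e_dim),
    )
    return vqgan.decode(quant)

# A transformer can then be trained to predict these code sequences conditioned
# on user history and product attributes; its predicted codes are passed
# through codes_to_images to obtain the generated image.
```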