shizhediao / DaVinci

Source code for the paper "Prefix Language Models are Unified Modal Learners"
BSD 3-Clause "New" or "Revised" License

How do I use ORD datasets with the model? #7

Open ALR-alr opened 1 week ago

ALR-alr commented 1 week ago

ORD datasets have two parts: scene graphs and region descriptions. How do you use them for an I2T or T2I task? Do you rewrite the region descriptions into a whole sentence to use as the text input?

shizhediao commented 1 week ago

Yes, we did some data augmentation based on the original data. You can find the details in Appendix A.3: https://arxiv.org/pdf/2206.07699

ALR-alr commented 1 week ago

Thanks for your rapid reply! Does the prompt template consider only the objects and ignore the attributes and relations?

shizhediao commented 1 week ago

Yes, but you could explore it if you think it is beneficial. Thanks!

ALR-alr commented 1 week ago

Thank you very much, your reply is crucial to me!
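
For readers who land on this thread, here is a minimal sketch of what an object-only prompt template could look like, as discussed above. It is an illustration only: the template wording, the `objects_to_caption` helper, and the annotation field names (`objects`, `name`, `attributes`, `relations`) are all hypothetical and are not taken from the paper or the repository. The actual augmentation is described in Appendix A.3.

```python
from typing import Dict, List

# Hypothetical template; the real wording is given in Appendix A.3 of the paper.
TEMPLATE = "This image contains {objects}."


def join_names(names: List[str]) -> str:
    """Join object names into a natural-language list."""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + f", and {names[-1]}"


def objects_to_caption(annotation: Dict) -> str:
    """Build a sentence from object names only.

    Attributes and relations are deliberately ignored here, matching the
    object-only behaviour confirmed in this thread. The field names are
    assumptions about the annotation layout.
    """
    # Deduplicate names while preserving their first-seen order.
    names = list(dict.fromkeys(obj["name"] for obj in annotation.get("objects", [])))
    if not names:
        return ""
    return TEMPLATE.format(objects=join_names(names))


# Toy annotation in an assumed scene-graph-like layout.
example = {
    "objects": [
        {"name": "dog", "attributes": ["brown"]},
        {"name": "frisbee", "attributes": []},
        {"name": "dog", "attributes": ["small"]},
    ],
    "relations": [{"subject": "dog", "predicate": "catching", "object": "frisbee"}],
}

caption = objects_to_caption(example)
print(caption)
```

Extending the sketch to use attributes or relations, as suggested in the reply above, would mean adding those fields to the template, for example by prefixing each object name with its attributes.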