kohjingyu / gill

🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
https://jykoh.com/gill
Apache License 2.0

Why don't you use a universal representation in one task? #34

Open hsjkdjj opened 10 months ago

hsjkdjj commented 10 months ago

I am curious: why don't you use a universal representation within a single task? For example, input: [image] + caption; output: caption + [IMG1]...[IMGn].
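To make the format in the question concrete, here is a minimal sketch of how such a unified input/output sequence might be laid out. This is only an illustration: the helper names (`img_tokens`, `build_example`), the `[image]` placeholder, and the value of n are assumptions, not code or settings from the GILL repository.

```python
# Hypothetical illustration of the proposed unified format; not GILL's actual code.
NUM_IMG_TOKENS = 4  # assumed value for n, chosen only for this example


def img_tokens(n=NUM_IMG_TOKENS):
    """Return the placeholder image tokens [IMG1] ... [IMGn]."""
    return [f"[IMG{i}]" for i in range(1, n + 1)]


def build_example(input_caption, output_caption, n=NUM_IMG_TOKENS):
    """Lay out one sample as (input sequence, output sequence).

    input:  [image] + caption
    output: caption + [IMG1] ... [IMGn]
    """
    source = ["[image]"] + input_caption.split()
    target = output_caption.split() + img_tokens(n)
    return source, target


source, target = build_example("a dog on a beach", "the same dog at sunset")
print(source)
print(target)
```

Here a single sample carries both an image on the input side and image tokens on the output side, which is the "one task" setup the question asks about.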