Hello.
I am very interested in your research, especially your latest Any-to-Any model, CoDi-2.
My main question concerns the in-context multimodal instruction dataset you built for training CoDi-2.
Do you have any plans to make this data publicly available?
Additionally, the paper covers a wide variety of task types. Were you able to quantitatively evaluate performance on all of them?
I ask because only the image editing and audio editing tasks appear to have been evaluated quantitatively.
Thank you!
I have the same question. CoDi-2 is amazing work, and releasing the datasets and checkpoints would without a doubt increase its impact and contribute even more to the community.