Hi,
Nice work on indexing useful papers in this repo!
My team has released a new benchmark that evaluates the "omni understanding" ability of MLLMs across image, audio, and text. We believe it could be a useful resource for current research in the field, and we would be grateful if it could be included in this repo. Thanks!
links:
Best