lijiannuist / Efficient-Multimodal-LLMs-Survey

Efficient Multimodal Large Language Models: A Survey
Apache License 2.0

Wonderful survey! Adding a new work. #3

Closed gordonhu608 closed 1 month ago

gordonhu608 commented 1 month ago

Thanks for the wonderful survey. We would like to add a new work: Matryoshka Query Transformer for Large Vision-Language Models. Paper: https://arxiv.org/abs/2405.19315 Code: https://github.com/gordonhu608/MQT-LLaVA. This model offers a new perspective on efficiently utilizing visual tokens in multimodal LLMs. Thank you in advance for considering our work.

lijiannuist commented 1 month ago

Great work! We have added it to this repo and will include the paper in our next version.