What should the input image size be for CuMo?

SHI-Labs / CuMo

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Apache License 2.0

Closed: leoozy closed this issue 1 month ago

chrisjuniorli commented 1 month ago

The input image can be any size. The data loader resizes it to 336x336 before sending it to CLIP.
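As a rough illustration of what "any size in, 336x336 out" means, here is a minimal dependency-free sketch. It is not CuMo's data loader: `resize_nearest` is a hypothetical helper, the nearest-neighbour sampling is an assumption made for brevity, and the real pipeline may pad, crop, or interpolate differently before the image reaches CLIP.

```python
def resize_nearest(pixels, out_h=336, out_w=336):
    """Resize a row-major 2D grid of pixels to out_h x out_w.

    Hypothetical helper for illustration only: it uses nearest-neighbour
    sampling and ignores aspect ratio, which may differ from what the
    actual loader does.
    """
    in_h = len(pixels)
    in_w = len(pixels[0])
    return [
        # Map each output coordinate back to the closest source pixel.
        [pixels[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Two synthetic grayscale images with different sizes and aspect ratios.
small = [[(x + y) % 256 for x in range(100)] for y in range(50)]
large = [[(x + y) % 256 for x in range(1024)] for y in range(768)]

small_out = resize_nearest(small)
large_out = resize_nearest(large)

# Both results have the same fixed shape regardless of the input size.
print(len(small_out), len(small_out[0]))
print(len(large_out), len(large_out[0]))
```

Whatever the input dimensions, the encoder always sees the same fixed resolution, so users do not need to resize images themselves beforehand.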

leoozy commented 1 month ago

Thanks.

leoozy commented 1 month ago

Hello, I noticed that the paper describes multi-resolution input. Does this implementation include that feature? Thanks!

chrisjuniorli commented 1 month ago

leoozy commented 1 month ago

Thanks!