apple / ml-aim

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

native resolution support on larger size #17

Open MonolithFoundation opened 6 days ago

MonolithFoundation commented 6 days ago

Hi, would you consider open-sourcing a larger native-resolution model, especially the 1B one?

DonkeyShot21 commented 5 days ago

Thank you for your interest! At the moment we do not have plans to release larger native resolution models. However, we appreciate your feedback and will keep this in mind.

lucasjinreal commented 4 days ago

Hello, I am also very interested in large native-resolution models (even ones as large as 1B).

Currently, vision encoders (VEs) with a fixed input resolution handle high-resolution inputs poorly, for example in document understanding and related domains. (We can only fall back on interpolation, which significantly sacrifices accuracy.)

I hope your team will consider open-sourcing large native-resolution models to benefit the community.
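
For readers unfamiliar with the interpolation workaround mentioned above, here is a minimal sketch. It assumes the workaround means bilinearly resizing a learned grid of position embeddings so a fixed-resolution encoder can accept a larger patch grid. It is not taken from the ml-aim code: the function name `resize_pos_embed`, the corner-aligned sampling, and the toy grid are all hypothetical choices made for illustration.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resize a 2D grid of embedding vectors (hypothetical helper).

    grid: list of rows; each row is a list of embedding vectors (lists of floats).
    Returns a new_h x new_w grid of vectors of the same dimension.
    """
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])

    def src_coord(i, new_n, old_n):
        # Map an output index to a source coordinate so that corners align.
        if new_n == 1 or old_n == 1:
            return 0.0
        return i * (old_n - 1) / (new_n - 1)

    out = []
    for i in range(new_h):
        y = src_coord(i, new_h, old_h)
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        wy = y - y0
        row = []
        for j in range(new_w):
            x = src_coord(j, new_w, old_w)
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            wx = x - x0
            # Blend the four neighbouring embeddings, one component at a time.
            vec = [
                (1 - wy) * ((1 - wx) * grid[y0][x0][d] + wx * grid[y0][x1][d])
                + wy * ((1 - wx) * grid[y1][x0][d] + wx * grid[y1][x1][d])
                for d in range(dim)
            ]
            row.append(vec)
        out.append(row)
    return out


# Toy 2x2 grid of 1-dimensional "embeddings", resized to 3x3.
small = [[[0.0], [1.0]], [[2.0], [3.0]]]
large = resize_pos_embed(small, 3, 3)
```

In this sketch the new positions receive blended values that the model never saw during training, which is one plausible source of the accuracy loss described above.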

aelnouby commented 1 day ago

Thanks for your feedback, @MonolithFoundation and @lucasjinreal! We will consider adding native resolution support for the higher capacity models in our short/mid-term plans and will keep you updated.

MonolithFoundation commented 1 day ago

Thank you so much for the consideration!