matatonic / openedai-vision

An OpenAI-compatible API for chat with image input and questions about the images (i.e., multimodal).
GNU Affero General Public License v3.0

Model request: Ovis1.6-Gemma2-9B-bnb-4bit #22

Open Jonseed opened 1 month ago

Jonseed commented 1 month ago

It would be great to add this 4-bit quantized version of Ovis 1.6 so it can run with less memory: https://huggingface.co/ThetaCursed/Ovis1.6-Gemma2-9B-bnb-4bit

matatonic commented 1 month ago

I tried this model and hit the same issue as with --load-in-4bit: a type conflict. You can try loading it yourself without any extra arguments; it doesn't work. I think this is something the model maker will need to fix, but if anyone knows a fix, I would be happy to make the changes.
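For anyone who wants to reproduce this outside the server, here is a minimal sketch of a plain load with no extra arguments. It assumes `transformers`, `bitsandbytes`, and `accelerate` are installed and that the repo loads through `AutoModelForCausalLM` with `trust_remote_code=True`, as the upstream Ovis models do. The helper name `try_load` is hypothetical. The thread does not say whether the type conflict surfaces at load time or only on the first image, so a successful load here does not prove the model works.

```python
MODEL_ID = "ThetaCursed/Ovis1.6-Gemma2-9B-bnb-4bit"  # repo linked in the request above


def try_load(model_id: str = MODEL_ID):
    """Attempt a plain load; return (model, None) on success or (None, message) on failure.

    Hypothetical helper for illustration, not part of openedai-vision.
    """
    try:
        # Imported lazily so the sketch can be imported without the heavy dependencies.
        from transformers import AutoModelForCausalLM

        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            trust_remote_code=True,  # assumption: the repo ships custom modeling code
        )
        return model, None
    except Exception as exc:  # broad on purpose: we only want to report the failure
        return None, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Running this downloads the model weights and needs a CUDA GPU for bitsandbytes.
    model, error = try_load()
    print("loaded OK" if error is None else f"load failed -> {error}")
```

If the load succeeds, the next thing to try is a single image prompt, since the conflict may only appear during image preprocessing.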

Jonseed commented 1 month ago

The model maker said "The issue arises during the image conversion process for the visual tokenizer. The preprocess_image function in the modeling_ovis.py script fails to properly convert the images to the required format or type for the visual tokenizer." They then said they got it to work. Maybe they would be willing to share how they fixed it.

Jonseed commented 2 weeks ago

There are now official 4-bit versions available:

https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4

https://huggingface.co/AIDC-AI/Ovis1.6-Llama3.2-3B-GPTQ-Int4