Open pjrpjr opened 5 months ago
I second this. I have run the live demo locally with the provided inference code, and it produces considerably better tags/captions than any of the models in taggui that can run in 24GB VRAM. It's also very fast even at FP16 (about 3s/caption for me on a 3090).
Easily my favorite model I have worked with. If I knew how to code in any capacity, I would try my best to help taggui support it.
Have you tried the new model [GLM-4v-9B](https://huggingface.co/THUDM/glm-4v-9b), which was released a few days ago? It does better than any other model.
I third this! It seems to do a very good job, based on a few tests with the online model.
PS: Just started using TagGUI over the weekend and am super impressed with the speed and usability of the program. Amazing!
btw it's available in ollama
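For anyone who wants to try it locally before taggui support lands, here is a minimal sketch via Ollama. This assumes Ollama is installed and that the model is published under the `minicpm-v` tag; check the Ollama model library for the exact name and available quantizations.

```shell
# Sketch only: assumes Ollama is installed and the tag `minicpm-v` exists.
ollama pull minicpm-v

# Multimodal models in Ollama accept local image paths in the prompt:
ollama run minicpm-v "Describe this image in detail: ./photo.jpg"
```

Quantized GGUF builds served this way should fit comfortably in 24GB VRAM, though caption quality may differ slightly from the FP16 demo.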
Please add support for this model. https://github.com/OpenBMB/MiniCPM-V
This is the demo page for testing it: https://huggingface.co/spaces/openbmb/MiniCPM-Llama3-V-2_5
On the leaderboard, this model shows a good combination of efficiency and quality for image captioning: https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Please add support for it.