BenderBlender closed this 8 months ago
The author has already done it himself :) https://github.com/shadowcz007/comfyui-moondream
Damn, this is interesting! Thanks for sharing! I didn't know there were such small models able to do that (I was stuck on the main LLaVA model, which is great but also really heavy). I will definitely check it out!
I put together a workflow with your node that draws an image itself and then tries to improve its creation using this pattern recognition model :)
Very nice!
The guys have already added a second model, and it seems even better than the first. Much better, I would say... https://github.com/zhongpei/Comfyui_image2prompt
Could you recommend a good model (GGUF GPT) that will work with your node? There are so many of them that I'm overwhelmed. Maybe there is some kind of ranking?
As you said, there are too many of them, so I really can't say which is better than which. But yes, there are a lot of benchmarks used for rating LLMs. You can find a summary here: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard or here: https://eqbench.com/
OK, I've added support for the LLaVA model, moondream, and joytag. I'm trying to add interlm (I've written most of the code), but I'm having some issues. Also, since it's a heavier model, I have to give it lower priority, so I think I will add it, but not right away.
I wanted to use “blip analyze image” in my workflow, but after the latest ComfyUI updates this node unfortunately stopped working. However, an excellent neural network model with vision support has appeared (Local Tiny AI Vision Language Model (1.6B)).
It would be great if you could add support for this model to the nodes! https://github.com/vikhyat/moondream?tab=readme-ov-file
https://www.youtube.com/watch?v=oDGQrOlmC1s