gokayfem / ComfyUI_VLM_nodes

Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
Apache License 2.0
297 stars 23 forks

Kosmos 2.5 released - Possible to add support for it? #85

Closed. CCpt5 closed this issue 1 month ago

CCpt5 commented 1 month ago

Hello,

Thanks for your time and effort on this node!

I noticed that Microsoft quietly released the Kosmos 2.5 model yesterday. It is available here: https://github.com/microsoft/unilm/tree/master/kosmos-2.5 and here: https://huggingface.co/microsoft/kosmos-2.5

The paper was published last September: https://arxiv.org/abs/2309.11419

Curious if you have plans to make it usable with this node or for ComfyUI in any fashion?

Thanks for any info!!

CCpt5 commented 1 month ago

Or perhaps just renaming the new model file would work? It seems like it might, in spite of the .ckpt vs. .safetensors extension.

gokayfem commented 1 month ago

There have been a lot of new VLMs released in the last 2 months. I'm planning to add some of them: Mantis, LLaVA Llama 3, LLaVA Phi-3, Idefics 8B, DeepSeek-VL, Qwen-VL, InternLM 4KHD, nanoLLaVA, PaliGemma, etc.