THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B
Apache License 2.0

win10 support #154

Open sipie800 opened 1 month ago

sipie800 commented 1 month ago

Feature request / 功能建议

The model uses FastRotaryEmbedding, which relies on Triton. However, Triton is not currently available on Windows. There are some unofficial Triton wheels for Windows 10 floating around; I tested them and they don't support CogVLM (neither v1 nor v2). Many applications need to be deployed on Windows; Linux is largely limited to web-facing services. Local LLMs are the future, so please pay more attention to local deployment support.
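One common pattern for handling this (not part of the CogVLM2 codebase; a hypothetical sketch) is to probe for Triton at import time and select a fallback implementation when it is missing, e.g. on Windows:

```python
# Hypothetical availability guard: detect whether Triton can be imported
# so callers can fall back to a pure-PyTorch rotary embedding on platforms
# (such as Windows) where Triton wheels are unavailable.
try:
    import triton  # noqa: F401
    HAS_TRITON = True
except ImportError:
    HAS_TRITON = False

def pick_rotary_impl():
    """Return a label for the rotary-embedding backend to use."""
    return "triton" if HAS_TRITON else "pytorch"
```

Whether such a fallback preserves output quality is exactly the open question in this thread.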

Motivation / 动机

none

Your contribution / 您的贡献

none

zRzRzRzRzRzRzR commented 1 month ago

To run inference with this model, the xformers library must be used, which poses a significant challenge for Windows users. For versions v1 and v2, it is not straightforward to remove xformers without compromising quality (as we have tested before). We will try to remove this dependency in future versions of the model.

sipie800 commented 1 month ago

Thanks. I do have xformers installed, and it works fine with other models such as Stable Diffusion. The issue is that FastRotaryEmbedding uses Triton. I have no idea what the connection between xformers and Triton is, or whether the model could simply use another FastRotaryEmbedding implementation rather than the Triton one?
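For reference, a rotary embedding does not strictly require Triton; the rotation can be expressed in plain PyTorch. The sketch below is a generic, hypothetical implementation (function names `build_rope_cache` and `apply_rotary` are my own, not CogVLM2 API), not a drop-in replacement for the model's FastRotaryEmbedding:

```python
import torch

def build_rope_cache(seq_len: int, dim: int, base: float = 10000.0):
    # Precompute cos/sin tables for rotary position embeddings.
    # dim must be even; each frequency rotates one (even, odd) channel pair.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(seq_len).float()
    freqs = torch.outer(t, inv_freq)  # shape: (seq_len, dim // 2)
    return freqs.cos(), freqs.sin()

def apply_rotary(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    # x: (..., seq_len, dim). Rotate each channel pair by the cached angles.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

At position 0 all angles are zero, so the embedding leaves the first token unchanged; that makes a quick sanity check easy. Whether a pure-PyTorch path matches the Triton kernel's numerics and speed for this model is what the maintainers would need to verify.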

FurkanGozukara commented 1 month ago

Same issue here. By the way, v1 works perfectly for me on Windows, even with 4-bit quantization.

https://github.com/THUDM/CogVLM2/issues/169