Open sipie800 opened 1 month ago
To run inference with this model, the xformers library must be used, which poses a significant challenge for Windows users. For v1 and v2, removing xformers without compromising quality is not straightforward (we have tested this before). We will try to remove this dependency in future versions of the model.
Thanks. I do have xformers installed, and I use it with other models such as Stable Diffusion without problems. The issue is that FastRotaryEmbedding uses Triton. I don't know what the connection between xformers and Triton is. Could the model simply use another FastRotaryEmbedding implementation instead of the Triton one?
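For context on what a non-Triton implementation would have to compute, here is a minimal reference sketch of a rotary position embedding in plain Python. It is not the model's FastRotaryEmbedding code. The function names, the `base=10000.0` default, and the interleaved pair layout (`vec[2i]`, `vec[2i+1]`) are assumptions made for illustration, and real implementations may pair dimensions differently. Whether a fallback along these lines preserves output quality is untested here.

```python
import math


def rotary_angles(position, dim, base=10000.0):
    """Return position * base**(-2i/dim) for each of the dim // 2 pairs."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]


def apply_rotary(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[2i], vec[2i+1]) by its angle.

    Assumption: interleaved pair layout; other layouts split the vector
    into two halves instead.
    """
    dim = len(vec)
    if dim % 2:
        raise ValueError("dimension must be even")
    out = []
    for i, theta in enumerate(rotary_angles(position, dim, base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        # Standard 2D rotation of the pair (x, y) by theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out
```

The operation is an elementwise rotation built from sines and cosines, so in principle it can be written with ordinary tensor operations. A Triton kernel is typically a speed optimization for this step, not a different computation.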
Same issue here. By the way, v1 works perfectly for me on Windows, even with 4-bit quantization.
Feature request
The model uses FastRotaryEmbedding, which relies on Triton. However, Triton is not currently available on Windows. Some unofficial Triton wheels for Windows 10 can be found, but we tested them and they do not support CogVLM (neither v1 nor v2). Many applications need to be deployed on Windows, while Linux is limited to some web applications. Local LLMs are the future. Please pay more attention to local deployment support.
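One way such a dependency could be made optional is to check at startup whether Triton is importable and choose a code path accordingly. The sketch below uses only the Python standard library. The helper names `module_available` and `pick_rotary_backend` and the backend labels are hypothetical and are not part of the model's code.

```python
import importlib.util
import platform


def module_available(name):
    """Return True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_rotary_backend():
    """Return 'triton' when Triton is importable, otherwise 'fallback'.

    Hypothetical labels: the caller would map them to an actual implementation.
    """
    return "triton" if module_available("triton") else "fallback"


if __name__ == "__main__":
    # Print the operating system and the backend that would be selected.
    print(platform.system(), pick_rotary_backend())
```

A check like this only selects a path. It does not make an unofficial Triton build work, and a fallback path would still need to be validated against the Triton one.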
Motivation
none
Your contribution
none