QwenLM / qwen.cpp

C++ implementation of Qwen-LM

M1 Support #9

Open cylee0909 opened 9 months ago

cylee0909 commented 9 months ago

It raises an ImportError when running the python3 qwen_cpp/convert.py xxx command:

ImportError: This modeling file requires the following packages that were not found in your environment: kernels, flash_attn. Run pip install kernels flash_attn

And M1 does not support flash_attn (see the related flash_attn issue).
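One possible workaround (untested with this repo) is to keep the transformers dynamic-module import check from requiring flash_attn and kernels, since the conversion only needs to read the weights. Below is a minimal sketch assuming convert.py loads the model through transformers with trust_remote_code=True; the model id is illustrative, and the modeling code could still fail later if it imports flash_attn unconditionally.

```python
# Hypothetical sketch: filter flash_attn/kernels out of transformers'
# dynamic-module import check so model loading does not fail on M1.
# Not verified against qwen_cpp/convert.py; the model id is an example.
from unittest.mock import patch

from transformers import AutoModelForCausalLM
from transformers.dynamic_module_utils import get_imports


def patched_get_imports(filename):
    # Drop packages that are unavailable on Apple Silicon from the check.
    return [pkg for pkg in get_imports(filename) if pkg not in ("flash_attn", "kernels")]


with patch("transformers.dynamic_module_utils.get_imports", patched_get_imports):
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen-7B-Chat",  # assumed model id; use the one you pass to convert.py
        trust_remote_code=True,
    )
```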

xwdreamer commented 9 months ago

The same error.

yhyu13 commented 9 months ago

You are doing the conversion on a Mac with Apple Silicon; I don't think that is supported. You need to run the conversion on a CUDA device, and only inference with GGML is supported on Apple Silicon.
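For reference, a sketch of that two-machine workflow. The flag names follow my reading of the qwen.cpp README and may differ in your version; paths and hostnames are placeholders.

```sh
# 1. On a machine with CUDA (where flash_attn installs), convert to GGML:
python3 qwen_cpp/convert.py -i Qwen/Qwen-7B-Chat -t q4_0 -o qwen7b-ggml.bin

# 2. Copy the GGML file to the M1 Mac (example host/path):
scp qwen7b-ggml.bin user@m1-mac:~/qwen.cpp/

# 3. On the Mac, build qwen.cpp and run inference with the converted weights:
./build/bin/main -m qwen7b-ggml.bin --tiktoken Qwen-7B-Chat/qwen.tiktoken -p "hello"
```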

Anri-Lombard commented 2 months ago

This is obviously not ideal; it would be great if this were adapted for M1+ architectures.