OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

The version of transformers, auto_gptq, autoawq #88

Open zhangfzR opened 2 months ago

zhangfzR commented 2 months ago

This issue seems to be related to the versions of the transformers, auto_gptq, and autoawq libraries. Here are some specific problems you might face and their solutions:

1.  Error: “but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)”
2.  Error: “got an unexpected keyword argument ‘seq_len’” or missing “position_ids”
3.  Error: “missing ‘gemma’ in awq”

These issues are often due to version incompatibilities. Specifically, you'll need transformers < 4.38.0, because 4.38.0 and above can cause problems 1 and 2. However, transformers versions below 4.38.0 do not include the Gemma model, which was added in 4.38.0, and autoawq only added Gemma support in 0.2.4. Thus, if you need Gemma, there is no environment that avoids problems 1 and 2 (which requires transformers < 4.38.0) while also supporting Gemma (which requires transformers >= 4.38.0 and autoawq >= 0.2.4).

To resolve problems 1 and 2, pin transformers to 4.37.2 and autoawq to 0.2.3. This combination has worked for me:

`pip install transformers==4.37.2 autoawq==0.2.3`
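To catch a drifted environment before a confusing runtime error, a small version guard can help. This is only a sketch: the `PINS` dict, `parse`, and `check_pins` are hypothetical helpers I made up, seeded with the versions pinned above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported to work together in this issue (assumption: these stay valid).
PINS = {"transformers": "4.37.2", "autoawq": "0.2.3"}

def parse(v):
    """Minimal version parser: '4.37.2' -> (4, 37, 2); non-digits are dropped."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def check_pins(installed, pins):
    """Return human-readable mismatch messages for packages whose installed
    version is missing or differs from the pinned one."""
    problems = []
    for name, wanted in pins.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name} is not installed (want {wanted})")
        elif parse(have) != parse(wanted):
            problems.append(f"{name}=={have} (want {wanted})")
    return problems

if __name__ == "__main__":
    # Collect the actually-installed versions and report any drift.
    installed = {}
    for name in PINS:
        try:
            installed[name] = version(name)
        except PackageNotFoundError:
            pass
    for msg in check_pins(installed, PINS):
        print("version mismatch:", msg)
```

Running this at the top of a quantization script fails fast with a clear message instead of the opaque `seq_len` / device-mismatch errors above.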