microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

AWQ model export for CPU is not supported on Windows #787

Open · belovedxixi opened this issue 1 month ago

belovedxixi commented 1 month ago

Describe the bug: AWQ model export for CPU is not supported.

To Reproduce:

$ python3 builder.py -i awq_model_dir -o output_folder -p int4 -e cpu -c cache_dir

Screenshots: [screenshot of the error attached to the original issue, 2024-08-13]

Additional context: It seems like Hugging Face's quantization config file cannot support AwqConfig for CPU.

kunal-vaishnavi commented 1 month ago

According to Hugging Face, this error is raised when torch.cuda.is_available() returns False, which means that torch cannot find a CUDA installation on your machine.
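For reference, here's a minimal check (assuming a standard PyTorch install) of what transformers will observe on your machine:

import torch
print(torch.cuda.is_available())  # False here is what triggers the AwqConfig error on a CPU-only machine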

As you mentioned, it appears that Hugging Face requires that any machine loading an AwqConfig have CUDA installed. Here are a couple of workaround options.

  1. If your machine supports installing CUDA, verify that CUDA is installed and that torch.cuda.is_available() returns True. Then run your command again.
  2. If your machine doesn't support installing CUDA, modify transformers so that post_init() is not called, which avoids raising the error. Here are the steps to do this.
    
# 1) Uninstall the existing `transformers` package
$ pip uninstall -y transformers

# 2) Clone the transformers repo
$ git clone https://github.com/huggingface/transformers

# 3) Navigate to the folder that quantization_config.py is in
$ cd transformers/src/transformers/utils/

# 4) Comment out `self.post_init()` in quantization_config.py. Here's the location:
# https://github.com/huggingface/transformers/blob/9d2ab8824c2a820a0ac9f2cd884ef838ca77b481/src/transformers/utils/quantization_config.py#L814
# This will prevent the post-init steps from running, so the error should not be raised.
# (See the sketch after these steps for an illustration of what this edit does.)

# 5) Install transformers from source
$ cd ../../../
$ pip install -e .


Then run your command again.
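For illustration only, here is a small runtime sketch of the same idea: instead of editing the installed file, patch AwqConfig.post_init to a no-op before the config is constructed. This is not one of the steps above, and the import path is an assumption based on recent transformers releases; verify it against your installed version.

# Illustrative sketch: bypass AwqConfig.post_init() at runtime instead of editing quantization_config.py.
# Assumes the AwqConfig import path used by recent transformers releases.
from transformers.utils.quantization_config import AwqConfig

AwqConfig.post_init = lambda self: None  # no-op; skips the post_init checks, including the CUDA-availability check

config = AwqConfig(bits=4, group_size=128)  # now constructs without raising on a CPU-only machine
print(config)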
natke commented 2 weeks ago

@belovedxixi Did you try the above suggestions? Let us know how it went.