casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

weird error with python convert-to-awq.py #422

Closed · silvacarl2 closed this 5 months ago

silvacarl2 commented 5 months ago

using this example:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'lmsys/vicuna-7b-v1.5'
quant_path = 'vicuna-7b-v1.5-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

trying to convert a local model to AWQ:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = './cmd-merged-model'
quant_path = './cmd-merged-model-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

we get this error:

```
Loading checkpoint shards: 100%|█████████████████| 3/3 [00:02<00:00, 1.09it/s]
Traceback (most recent call last):
  File "convert-to-awq.py", line 13, in <module>
    model.quantize(tokenizer, quant_config=quant_config)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/silvacarl/.local/lib/python3.8/site-packages/awq/models/base.py", line 162, in quantize
    self.quantizer = AwqQuantizer(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/awq/quantize/quantizer.py", line 59, in __init__
    self.modules, self.module_kwargs, self.inps = self.init_quant()
  File "/home/silvacarl/.local/lib/python3.8/site-packages/awq/quantize/quantizer.py", line 437, in init_quant
    samples = get_calib_dataset(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/awq/utils/calib_data.py", line 17, in get_calib_dataset
    dataset = load_dataset("mit-han-lab/pile-val-backup", split="validation")
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/load.py", line 2556, in load_dataset
    builder_instance = load_dataset_builder(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/load.py", line 2228, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/load.py", line 1879, in dataset_module_factory
    raise e1 from None
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/load.py", line 1854, in dataset_module_factory
    return HubDatasetModuleFactoryWithoutScript(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/load.py", line 1206, in get_module
    dataset_readme_path = cached_path(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 190, in cached_path
    output_path = get_from_cache(
  File "/home/silvacarl/.local/lib/python3.8/site-packages/datasets/utils/file_utils.py", line 493, in get_from_cache
    os.makedirs(cache_dir, exist_ok=True)
  File "/usr/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/silvacarl/.cache/huggingface/datasets/downloads'
```

any ideas?

silvacarl2 commented 5 months ago

Ignore, found it.
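
For anyone else who lands here: the traceback shows the `datasets` library failing to create its download cache under `/home/silvacarl/.cache/huggingface/datasets` while fetching AutoAWQ's default calibration set (`mit-han-lab/pile-val-backup`), so the failure is a filesystem permission problem rather than anything AWQ-specific. The reporter did not post their fix, so the sketch below is only an assumption: it redirects the Hugging Face datasets cache to a writable location via the `HF_DATASETS_CACHE` environment variable before quantizing. Restoring ownership of `~/.cache/huggingface` would likely work just as well.

```python
# A minimal sketch, NOT the reporter's actual fix (which was not shared).
# Assumption: the PermissionError comes from ~/.cache/huggingface/datasets not
# being writable by the current user, so we point the datasets cache at a
# directory we own before anything imports the `datasets` library.
import os
os.environ["HF_DATASETS_CACHE"] = "/tmp/hf_datasets_cache"  # any writable path

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = './cmd-merged-model'
quant_path = './cmd-merged-model-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# The default calibration set is downloaded into the cache configured above.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```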