intel / auto-round

Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
https://arxiv.org/abs/2309.05516
Apache License 2.0

falcon 7b bug with disable_trust_remote_code #125

Closed: wenhuach21 closed this issue 4 months ago

wenhuach21 commented 4 months ago

python3 main.py \
  --model_name $model_name \
  --device 0 \
  --group_size 64 \
  --bits 4 \
  --iters 1000 \
  --eval_bs $eval_bs \
  --disable_eval \
  --disable_trust_remote_code \
  --deployment_device 'gpu,fake' \
  --disable_quanted_input \
  --disable_low_gpu_mem_usage \
  --output_dir "/data5/wenhuach/falcon-7b-iter1000-w4g64-disable_quanted_input" \
  2>&1 | tee -a tmp.txt

Traceback (most recent call last):
  File "/home/wenhuach/auto-round/examples/language-modeling/main.py", line 312, in <module>
    model, _ = autoround.quantize()
  File "/home/wenhuach/auto-round/examples/language-modeling/../../auto_round/autoround.py", line 575, in quantize
    self.quant_blocks(
  File "/home/wenhuach/auto-round/examples/language-modeling/../../auto_round/autoround.py", line 1318, in quant_blocks
    q_input, input_ids = self.quant_block(
  File "/home/wenhuach/auto-round/examples/language-modeling/../../auto_round/autoround.py", line 1137, in quant_block
    output = self.get_block_outputs(block, input_ids, input_others, self.train_bs, device, self.cache_device)
  File "/home/wenhuach/anaconda3/envs/autoround/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/wenhuach/auto-round/examples/language-modeling/../../auto_round/autoround.py", line 744, in get_block_outputs
    tmp_output = block_forward(block, tmp_input_ids, tmp_input_others, self.amp, self.amp_dtype, device).to(
  File "/home/wenhuach/auto-round/examples/language-modeling/../../auto_round/utils.py", line 479, in block_forward
    attention_mask = input_others["attention_mask"]
KeyError: 'attention_mask'
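
The last frame shows the failure is a plain dictionary lookup: `input_others` has no `attention_mask` key for this model when `--disable_trust_remote_code` is set. The sketch below reproduces that failure mode in isolation and contrasts it with a tolerant lookup. It is an illustration only, not the change made in the linked pull request; the helper names `read_mask_strict` and `read_mask_tolerant` and the contents of the example dictionary are assumptions, and the real structure of `input_others` in auto-round may differ.

```python
# Illustrative stand-in for the cached per-block keyword arguments.
# The keys used here are assumed for the example, not taken from auto-round.

def read_mask_strict(input_others):
    # Mirrors the failing lookup from the traceback: raises KeyError if absent.
    return input_others["attention_mask"]


def read_mask_tolerant(input_others):
    # Hypothetical defensive variant: returns None when the key is absent.
    return input_others.get("attention_mask")


# A cache that lacks the attention mask, as the traceback implies.
cached_without_mask = {"alibi": None, "use_cache": False}

try:
    read_mask_strict(cached_without_mask)
    strict_failed = False
    missing_key = None
except KeyError as err:
    strict_failed = True
    missing_key = err.args[0]

tolerant_result = read_mask_tolerant(cached_without_mask)
print(strict_failed, missing_key, tolerant_result)
```

Whether passing `None` downstream is acceptable depends on how the block's forward function treats a missing mask, so any real fix would need to be checked against the model's own forward signature.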

WeiweiZhang1 commented 4 months ago

https://github.com/intel/auto-round/pull/126