casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

bloomz_7b1 error message TypeError: forward() missing 1 required positional argument: 'alibi' #288

Open · opened 10 months ago by oreojason

oreojason commented 10 months ago

RUNNING

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = '/data/bloomz_7b1'
quant_path = '/data/bloomz_7b1_4bit'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 8, "version": "GEMM"}

# Load model
# NOTE: pass safetensors=True to load safetensors
model = AutoAWQForCausalLM.from_pretrained(
    model_path, **{"low_cpu_mem_usage": True, "use_cache": False}
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config,text_column="text")

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

ERROR:

/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/huggingface_hub/repocard.py:105: UserWarning: Repo card metadata block was not found. Setting CardData to empty.
  warnings.warn("Repo card metadata block was not found. Setting CardData to empty.")
AWQ:   0%|                                                                                                                                                          | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/LLaMA-Factory/autoawq_demo/quan_demo.py", line 20, in <module>
    model.quantize(tokenizer, quant_config=quant_config,text_column="text")
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/models/base.py", line 93, in quantize
    quantizer.quantize()
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/quantize/quantizer.py", line 95, in quantize
    input_feat = self._get_input_feat(self.modules[i], named_linears)
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/quantize/quantizer.py", line 406, in _get_input_feat
    self.inps = layer(self.inps, **module_kwargs)[0]
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: forward() missing 1 required positional argument: 'alibi'

How should I resolve this error?

THX

casper-hansen commented 10 months ago

Which transformers version are you using? Could you try 4.34.1 and 4.35.2? A little background... recently the 4.36 version broke a lot of things around how we cache arguments in AutoAWQ, which we mostly fixed but there are still edge cases like this.

CC: @younesbelkada
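For reference, a quick way to check which transformers version is actually installed before downgrading (the version pins below are just the ones suggested in this comment, nothing official):

import transformers

# Print the installed transformers version; the suggestion above is to try
# 4.34.1 or 4.35.2 if this shows a 4.36.x release.
print(transformers.__version__)

# To downgrade in the same environment, something like:
#   pip install "transformers==4.35.2"   # or 4.34.1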

oreojason commented 10 months ago

Which transformers version are you using? Could you try 4.34.1 and 4.35.2? A little background... recently the 4.36 version broke a lot of things around how we cache arguments in AutoAWQ, which we mostly fixed but there are still edge cases like this.

CC: @younesbelkada

At first I was using 4.36.2, but I have now tried 4.34.1 and 4.35.2 without resolving the issue:

4.34.1: ImportError: cannot import name 'insecure_hashlib' from 'huggingface_hub.utils' (/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/huggingface_hub/utils/__init__.py)
4.35.2: ImportError: cannot import name 'MoeModelOutputWithPast' from 'transformers.modeling_outputs' (/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/transformers/modeling_outputs.py)
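For context, these ImportErrors usually point to mismatched package pins in the environment rather than to AutoAWQ itself. A small standard-library sketch (the package list is only a guess at the relevant ones) that prints what is actually installed:

from importlib.metadata import PackageNotFoundError, version

# Print the versions of the packages involved in the ImportErrors above, so
# mismatched pins (for example an old huggingface_hub next to a newer
# transformers or datasets) are easy to spot.
for pkg in ("transformers", "huggingface_hub", "datasets", "autoawq", "torch"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")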

franchukpetro commented 8 months ago

Hi, I have a similar issue while quantizing the Falcon-1B model:

TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'

Here is my code:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "tiiuae/falcon-rw-1b"
quant_path = 'falcon-rw-1b-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path, **{"low_cpu_mem_usage": True})
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

My setup:

@oreojason have you managed to fix this issue, or have you, @casper-hansen, already figured out the reason for this behaviour?

49Simon commented 8 months ago

Hi, I still have not figured out what the issue is. I tried with Falcon-7B and it throws the same error. Quantizing Falcon-7B works with autoawq==0.1.7 (and prior, 0.1.6 also worked). Give it a try.

My setup: torch==2.1.2 torchvision==0.16.2 autoawq==0.1.7 autoawq_kernels==0.0.6

The problem for me is converting AWQ to GGUF. That support was added in autoawq==0.2.0. However, applying the AWQ scales fails with this error: TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'.

I have tried autoawq>=0.2.0 and it still fails. Any idea @casper-hansen?

casper-hansen commented 8 months ago

It seems the implementation broke a while ago. Unfortunately, I do not currently have the capacity to research old models that break with new updates. I welcome all PRs and will help review them if you want to research how to fix this issue.

Chinni2103 commented 4 months ago

I have tried AWQ on Falcon-1B and it gives me the same issue: TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'.

My versions: autoawq==0.2.5, autoawq_kernels==0.0.6.

When I tried the other versions (torch==2.1.2, torchvision==0.16.2, autoawq==0.1.7, autoawq_kernels==0.0.6), it gives me the error OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory.
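As an aside, that OSError only means that none of the expected weight files were found where the loader looked; the first script in this thread also notes that safetensors=True can be passed to from_pretrained, which may matter depending on which format the checkpoint ships. A quick diagnostic sketch (assuming the model is pulled from the Hub; the tiiuae/falcon-rw-1b repo id is reused from the earlier comment, and for a local directory os.listdir would do the same job):

from huggingface_hub import list_repo_files

# List the files in the repo to check which weight format is actually present
# (pytorch_model.bin, model.safetensors, sharded variants, ...).
files = list_repo_files("tiiuae/falcon-rw-1b")
print([f for f in files if f.endswith((".bin", ".safetensors"))])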

@casper-hansen @franchukpetro @49Simon @oreojason, has anyone managed to fix this issue?