Open bf96163 opened 1 month ago
@bf96163 Take a look at this PR: https://huggingface.co/THUDM/glm-4-9b/discussions/4/files. You can make the same changes in your local modeling_chatglm.py; it should fix your NoneType issue.
Sorry, I tried the newest version of modeling_chatglm.py, but the problem is still there:
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 998, in forward
    transformer_outputs = self.transformer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 896, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/glm-4-9b-chat/modeling_chatglm.py", line 726, in forward
    layer_ret = layer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    args_kwargs_result = hook(self, args, kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/base.py", line 367, in store_input_hook
    layer_input.append(move_to(inp, data_device))
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/utils/model.py", line 74, in move_to
    if get_device(obj) != device:
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/utils/model.py", line 70, in get_device
    return next(obj.parameters()).device
AttributeError: 'NoneType' object has no attribute 'parameters'
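For context, the crash happens because the input-capture hook assumes every positional argument a decoder layer receives is a tensor or module with parameters, while ChatGLM's layer forward is called with some positional arguments set to None. Here is a minimal, torch-free sketch of that failure mode (FakeParam and FakeModule are hypothetical stand-ins, not the library's actual classes; only get_device mirrors the failing line in gptqmodel/utils/model.py):

```python
class FakeParam:
    """Stand-in for a torch parameter that knows its device."""
    def __init__(self, device):
        self.device = device


class FakeModule:
    """Stand-in for a torch.nn.Module exposing .parameters()."""
    def __init__(self, device="cuda:0"):
        self._params = [FakeParam(device)]

    def parameters(self):
        return iter(self._params)


def get_device(obj):
    # Mirrors gptqmodel/utils/model.py line 70: it blindly calls
    # obj.parameters(), so obj=None raises
    # AttributeError: 'NoneType' object has no attribute 'parameters'
    return next(obj.parameters()).device


get_device(FakeModule())  # -> "cuda:0"

# ChatGLM passes some positional args to each layer as None, and the
# store_input_hook feeds every captured arg through get_device:
for arg in [FakeModule(), None]:
    try:
        get_device(arg)
    except AttributeError as exc:
        print(f"AttributeError: {exc}")
```

A guard such as `if obj is None: return default_device` before the `parameters()` call would avoid the crash, which is the kind of change the linked PR applies on the model side instead.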
@bf96163 Have you solved it?
Describe the bug
When running model.quantize(examples), I get AttributeError: 'NoneType' object has no attribute 'parameters'.
GPU Info
Show output of:
Software Info
nvidia-docker inside an Ubuntu 22.04 host; container image: nvidia/cuda:12.1.0-runtime-ubuntu22.04. The host has CUDA 12.5 and driver 555.58. Show output of:
If you are reporting an inference bug of a post-quantized model, please post the content of config.json and quantize_config.json.
To Reproduce
Run the following script
Expected behavior
model.quantize(examples) runs without error.
Model/Datasets
https://huggingface.co/THUDM/chatglm3-6b
Screenshots
root@bf-llm-xinfer-vllm-pod-1-6d69cc5b69-hxpbq:/baifan/quant# python3 quantization_gptq.py
[HAMI-core Warn(83428:140567479620672:utils.c:183)]: get default cuda from (null)
{'device_map': 'cpu', 'trust_remote_code': True, 'torch_dtype': torch.float16}
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████| 7/7 [00:01<00:00, 4.94it/s]
WARNING - Calibration dataset size should be greater than 256. Current size: 1.
WARNING - The average length of input_ids of calibration_dataset should be greater than 256: actual avg: 19.0.
WARNING - Model config does not have pad token mapped. Please pass in tokenizer to quantize() so GPTQModel can auto-select the best pad token.
Traceback (most recent call last):
  File "/baifan/quant/quantization_gptq.py", line 115, in <module>
    model.quantize(examples)
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/base.py", line 410, in quantize
    self.model(example)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 937, in forward
    transformer_outputs = self.transformer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 830, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 640, in forward
    layer_ret = layer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    args_kwargs_result = hook(self, args, kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/base.py", line 367, in store_input_hook
    layer_input.append(move_to(inp, data_device))
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/utils/model.py", line 74, in move_to
    if get_device(obj) != device:
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/utils/model.py", line 70, in get_device
    return next(obj.parameters()).device
AttributeError: 'NoneType' object has no attribute 'parameters'
[HAMI-core Msg(83428:140567479620672:multiprocess_memory_limit.c:468)]: Calling exit handler 83428
Additional context
already tried (but no effect this time):