Veluriyam opened this issue 1 month ago
Please try downgrading transformers to version ~4.37.
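For example, one way to pin that version range with pip (the exact patch release is up to you):

pip install "transformers~=4.37.0"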
Based on the previous issues, I reconfigured the environment (python==3.10, torch==2.1.1, transformers==4.37, cuda==11.6) and got an error similar to yesterday's, but in a different place:
Traceback (most recent call last):
  File "/root/yp/LLM-Safeguard-main/code/forward.py", line 212, in <module>
    ...
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
In fact, I checked the shape of every hidden_state:

outputs = model(
    input_ids,
    attention_mask=input_ids.new_ones(input_ids.size(), dtype=model.dtype),
    return_dict=True,
    output_hidden_states=True,
)
for i, hidden_state in enumerate(outputs.hidden_states):
    print(f"Layer {i} shape: {hidden_state.shape}")
    print(f"Layer {i}: {hidden_state}")

When execution reaches this point, the first input prints shape torch.Size([1, 22, 4096]) for Layer 0 through Layer 32, but from Layer 19 to Layer 32 the tensors become all zeros (see the attached figure). For the next input, the shape printed at Layer 0 changes to [1, 21, 4096] and none of the tensors are zero, and then the error occurs. Is the error caused by the shape change?
Thank you. Changing the device_map passed when loading the model from "auto" to "sequential" solved the problem.
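For anyone landing here, a minimal sketch of that fix, assuming the standard transformers loading path (the arguments other than device_map are illustrative):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    device_map="sequential",  # instead of "auto"
    torch_dtype="auto",
)

With "sequential", accelerate fills GPU 0 before placing layers on GPU 1, rather than balancing the layers across both devices.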
Running bash scripts/forward.sh with Llama-2-7b-chat-hf:

Loading checkpoint shards: 100%|████████| 2/2 [00:06<00:00, 3.10s/it]
[2024-09-23 17:22:48,125] [forward.py:111] Model name: Llama-2-7b-chat-hf
[2024-09-23 17:22:48,132] [forward.py:112] Model size: 13.543948288
[2024-09-23 17:22:48,133] [utils.py:94] GPU 0: 6.88 GB / 32.00 GB
[2024-09-23 17:22:48,133] [utils.py:94] GPU 1: 6.88 GB / 32.00 GB
[2024-09-23 17:22:48,273] [forward.py:173] Running
0%| | 0/100 [00:00<?, ?it/s]
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1,0,0], thread: [96,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1,0,0], thread: [97,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
...
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [31,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
0%| | 0/100 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "forward.py", line 212, in <module>
    main()
  File "forward.py", line 177, in main
    hidden_states = forward(model, toker, messages)
  File "forward.py", line 52, in forward
    outputs = model(
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1183, in forward
    outputs = self.model(
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1070, in forward
    layer_outputs = decoder_layer(
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 798, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 706, in forward
    query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
  File "/opt/conda/envs/onprompt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 232, in apply_rotary_pos_emb
    cos = cos[position_ids].unsqueeze(unsqueeze_dim)
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

What is the cause of this error? Thanks.
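As a general debugging note (standard CUDA advice, not specific to this repo): device-side asserts are reported asynchronously, so the frame shown in the traceback can be misleading. Re-running with kernel launches forced synchronous, e.g.

CUDA_LAUNCH_BLOCKING=1 bash scripts/forward.sh

makes the traceback point at the actual failing op. Here the assert fires during cos[position_ids] in apply_rotary_pos_emb, which suggests a position id falling outside the bounds of the rotary embedding cache.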