ymcui / Chinese-LLaMA-Alpaca-2

Chinese LLaMA-2 & Alpaca-2 LLMs, phase-two project, with 64K long-context models
Apache License 2.0
7.04k stars 581 forks

longbench error: RuntimeError: CUDA error: device-side assert triggered #385

Closed panpanli521 closed 10 months ago

panpanli521 commented 10 months ago

Items to check before submitting

Issue type

Model inference

Base model

Chinese-LLaMA-2 (7B/13B)

Operating system

Linux

Detailed description of the problem

Command executed:
model_path=chinese-llama-2-13b
data_class=zh
output_path=output
with_inst="false" # or "true" or "auto"
max_length=4096
python pred_llama2.py \
    --model_path ${model_path} \
    --predict_on ${data_class} \
    --output_dir ${output_path} \
    --with_inst ${with_inst} \
    --max_length ${max_length}

Dependencies (must be provided for code-related issues)

bitsandbytes            0.39.0
peft                    0.6.0.dev0
sentence-transformers   2.2.2
sentencepiece           0.1.99
torch                   2.0.1
torchaudio              2.0.2
torchdata               0.6.1
torchelastic            0.2.2
torchtext               0.15.2
torchvision             0.15.2
transformers            4.35.0.dev0 /workspace/transformers

Logs or screenshots

![image](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/assets/15307422/2116ca85-4ce7-47a5-b85e-1f32bee985dd)
iMountTai commented 10 months ago

The image is broken.

panpanli521 commented 10 months ago

/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [51,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [52,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [53,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [54,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [55,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [56,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [57,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [58,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [59,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [60,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [61,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1682343967769/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1182,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
    layer_outputs = decoder_layer(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/workspace/transformers/src/transformers/models/llama/modeling_llama.py", line 633, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/cluster/ppl/eval_chinese/Chinese-LLaMA-Alpaca-2-main/scripts/attn_and_long_ctx_patches.py", line 58, in xformers_forward
    query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
  File "/workspace/transformers/src/transformers/models/llama/modeling_llama.py", line 211, in apply_rotary_pos_emb
    q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
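
The assertion repeated in the log above is a bounds check on an index tensor: every index must satisfy `-size <= index < size` for the dimension being indexed. The log does not say which tensor overflowed, so the following is only a minimal sketch of that same condition in plain Python (no PyTorch). The helper name `out_of_bounds` and the example numbers are hypothetical and chosen for illustration; they are not taken from the repository.

```python
def out_of_bounds(indices, size):
    """Return (position, value) pairs that violate -size <= index < size.

    This mirrors the condition in the assertion message:
    index >= -sizes[i] && index < sizes[i]
    """
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not (-size <= idx < size)
    ]


# Hypothetical example: a lookup table with 4096 rows and a few positions
# near its end. The last two positions fall outside the table.
table_rows = 4096
positions = [4094, 4095, 4096, 4097]
print(out_of_bounds(positions, table_rows))  # [(2, 4096), (3, 4097)]
```

Running a check of this kind on the CPU, against whatever indices are about to be used for a lookup, would give a readable list of offending values instead of an asynchronous device-side assert.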

panpanli521 commented 10 months ago

@iMountTai could you please take a look?

iMountTai commented 10 months ago

Could you try rolling transformers back to 4.31.0?

panpanli521 commented 10 months ago

> Could you try rolling transformers back to 4.31.0?

After rolling back the version, it runs now. Thanks a lot!
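
Since the fix in this thread was a version rollback, a small guard run before the evaluation script could catch the mismatch early. The sketch below assumes that 4.31.0 (the version reported to work above) is the one to compare against. The helper names `release_tuple`, `matches_known_good`, and `installed_transformers_version` are hypothetical and not part of the repository.

```python
import re
from importlib import metadata

KNOWN_GOOD = "4.31.0"  # version reported to work in this thread


def release_tuple(version):
    """Extract the leading numeric release, e.g. '4.35.0.dev0' -> (4, 35, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def matches_known_good(installed, known_good=KNOWN_GOOD):
    """True when the installed release equals the known-good release."""
    return release_tuple(installed) == release_tuple(known_good)


def installed_transformers_version():
    """Return the installed transformers version, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif matches_known_good(found):
        print(f"transformers {found} matches {KNOWN_GOOD}")
    else:
        print(f"transformers {found} differs from {KNOWN_GOOD}")
```

With the dependency list from this issue, `matches_known_good("4.35.0.dev0")` is false. The rollback itself can be done with `pip install transformers==4.31.0`.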