Hello, thank you for your work. When I run bash launch_chatglm_cmd.sh, I get the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Please Enter Your Name:
Username: Tiantian
Welcome, new user Tiantian! I will remember your name so I can greet you by name next time we meet!
Welcome to use SiliconFriend model, please enter your question to start conversation, enter "clear" to clear conversation, enter "stop" to stop program
Tiantian:tell me a joke
Traceback (most recent call last):
File "/home/Hongwei/disk_hdd/Tiantian/LLM_Memory/MemoryBank-SiliconFriend-main/SiliconFriend-ChatGLM-BELLE/cli_demo.py", line 212, in <module>
main()
File "/home/Hongwei/disk_hdd/Tiantian/LLM_Memory/MemoryBank-SiliconFriend-main/SiliconFriend-ChatGLM-BELLE/cli_demo.py", line 198, in main
history_state, history, msg = predict_new(text=query,history=history,top_p=0.95,temperature=1,max_length_tokens=1024,max_context_length_tokens=200,user_name=user_name,user_memory=user_memory,user_memory_index=user_memory_index)
File "/home/Hongwei/disk_hdd/Tiantian/LLM_Memory/MemoryBank-SiliconFriend-main/SiliconFriend-ChatGLM-BELLE/cli_demo.py", line 148, in predict_new
response = chat(model,tokenizer,text,history=history,
File "/home/Hongwei/disk_hdd/Tiantian/LLM_Memory/MemoryBank-SiliconFriend-main/SiliconFriend-ChatGLM-BELLE/cli_demo.py", line 106, in chat
outputs = model.generate(**inputs, **gen_kwargs)
File "/home/Hongwei/anaconda3/lib/python3.9/site-packages/peft/peft_model.py", line 1022, in generate
outputs = self.base_model.generate(**kwargs)
File "/home/Hongwei/anaconda3/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/Hongwei/anaconda3/lib/python3.9/site-packages/transformers/generation/utils.py", line 1308, in generate
model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
File "/home/Hongwei/anaconda3/lib/python3.9/site-packages/transformers/generation/utils.py", line 603, in _prepare_attention_mask_for_generation
is_pad_token_in_inputs = (pad_token_id is not None) and (pad_token_id in inputs)
File "/home/Hongwei/anaconda3/lib/python3.9/site-packages/torch/_tensor.py", line 703, in __contains__
return (element == self).any().item() # type: ignore[union-attr]
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
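I'm not certain this is the cause, but "no kernel image is available for execution on the device" usually means the installed PyTorch wheel was not compiled for this GPU's compute capability. Here is the quick check I can run in the same conda environment (assuming only that torch is importable there):

```python
import torch

# Compare the architectures this PyTorch build ships CUDA kernels for
# with the GPU's actual compute capability.
print("PyTorch:", torch.__version__, "built with CUDA:", torch.version.cuda)
print("Compiled architectures:", torch.cuda.get_arch_list())
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"GPU compute capability: sm_{major}{minor}")
```

If the GPU's sm_XX does not appear in the compiled architecture list, would installing a PyTorch build that matches the card resolve this?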