python moss_cli_demo.py
Fetching 17 files: 100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 83787.51it/s]
Waiting for all devices to be ready, it may take a few minutes...
欢迎使用 MOSS 人工智能助手!输入内容即可进行对话。输入 clear 以清空对话历史,输入 stop 以终止对话。
<|Human|>: hello
Traceback (most recent call last):
File "moss_cli_demo.py", line 89, in
main()
File "moss_cli_demo.py", line 72, in main
outputs = model.generate(
File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1358, in generate
if pad_token_id is not None and torch.sum(inputs_tensor[:, -1] == pad_token_id) > 0:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
python moss_cli_demo.py Fetching 17 files: 100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 83787.51it/s] Waiting for all devices to be ready, it may take a few minutes... 欢迎使用 MOSS 人工智能助手!输入内容即可进行对话。输入 clear 以清空对话历史,输入 stop 以终止对话。 <|Human|>: hello Traceback (most recent call last): File "moss_cli_demo.py", line 89, in main() File "moss_cli_demo.py", line 72, in main outputs = model.generate( File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context return func(*args, **kwargs) File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1358, in generate if pad_token_id is not None and torch.sum(inputs_tensor[:, -1] == pad_token_id) > 0: RuntimeError: CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.