OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University
https://txsun1997.github.io/blogs/moss.html
Apache License 2.0
11.95k stars, 1.15k forks

torch 1.10.1, CUDA 11.3: inference fails with "RuntimeError: CUDA error: no kernel image...". Is this caused by insufficient GPU memory? (RTX 3080) #183

Open Ben2522662 opened 1 year ago

Ben2522662 commented 1 year ago

python moss_cli_demo.py
Fetching 17 files: 100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 83787.51it/s]
Waiting for all devices to be ready, it may take a few minutes...
欢迎使用 MOSS 人工智能助手!输入内容即可进行对话。输入 clear 以清空对话历史,输入 stop 以终止对话。
<|Human|>: hello
Traceback (most recent call last):
  File "moss_cli_demo.py", line 89, in <module>
    main()
  File "moss_cli_demo.py", line 72, in main
    outputs = model.generate(
  File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1358, in generate
    if pad_token_id is not None and torch.sum(inputs_tensor[:, -1] == pad_token_id) > 0:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

yhyu13 commented 1 year ago

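requirements.txt requires pytorch==1.13.1, and that PyTorch version no longer seems to ship a CUDA 11.3 build, so you should use pytorch 1.13.1+cu117.

Also, without quantization, fp16 inference needs 32 GB of GPU memory, which a 3080 definitely does not have. Running the quantized version requires compiling GPTQ, and for that you need to install the CUDA toolkit at the same CUDA version your PyTorch build uses.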
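To tell the two problems apart (a PyTorch build without kernels for the GPU versus too little memory), a small diagnostic along these lines may help. It is only a sketch: `build_supports_device` and `weights_gb` are hypothetical helper names, the compatibility rule is a simplified reading of CUDA's binary/PTX rules, and the 16-billion-parameter count is an assumption chosen to match the 32 GB fp16 figure above. The `torch.version.cuda`, `torch.cuda.get_arch_list()`, `torch.cuda.get_device_capability()` and `torch.cuda.get_device_properties()` calls are standard PyTorch.

```python
def build_supports_device(capability, arch_list):
    """Simplified check: does a PyTorch build ship code usable on this GPU?

    capability: (major, minor) tuple, e.g. (8, 6) for an RTX 3080.
    arch_list:  strings such as "sm_80" or "compute_70".
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        if not num.isdigit():
            continue  # skip entries this sketch does not understand
        a_major, a_minor = int(num[:-1]), int(num[-1])
        # Binary (sm_XY) code runs on the same major version with minor >= Y.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX (compute_XY) can be JIT-compiled for any newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def weights_gb(n_params, bytes_per_param):
    """Memory needed just for the weights, in GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Assumption: 16e9 parameters, 2 bytes each in fp16.
    print("fp16 weights need about", weights_gb(16_000_000_000, 2), "GB")
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        total = torch.cuda.get_device_properties(0).total_memory
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
        print("device capability:", cap, "build arch list:", archs)
        print("build has kernels for this GPU:", build_supports_device(cap, archs))
        print("GPU memory (GB):", total / 1e9)
```

If the check reports that the build has no kernels for the GPU, the "no kernel image" error points at the PyTorch install rather than at memory. The memory shortfall is a separate issue that would show up as an out-of-memory error instead.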