Closed · msqp closed this issue 4 months ago
Please set ASCEND_LAUNCH_BLOCKING=1 and post the error again.
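For reference, a minimal sketch of applying the suggestion: `ASCEND_LAUNCH_BLOCKING=1` forces synchronous kernel launches on the NPU, so the failing operator is reported at its real call site instead of at some later asynchronous point. The API launch command shown in the comment is an assumption, not from this thread.

```shell
# Synchronous NPU kernel launches: errors surface at the real call site.
export ASCEND_LAUNCH_BLOCKING=1

# Then restart the API server, e.g. (exact command is an assumption):
# llamafactory-cli api <your_inference_config>.yaml

echo "$ASCEND_LAUNCH_BLOCKING"
```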
The error is as follows; thanks for your support!
05/17/2024 21:03:59 - INFO - llmtuner.api.chat - ==== request ====
{
"model": "",
"messages": [
{
"role": "user",
"content": "hello"
}
]
}
INFO: - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
return await self.app(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__
await self.middleware_stack(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/middleware/cors.py", line 93, in __call__
await self.simple_response(scope, receive, send, request_headers=headers)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/middleware/cors.py", line 148, in simple_response
await self.app(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__
await self.middleware_stack(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/codex/llama-factory/src/llmtuner/api/app.py", line 85, in create_chat_completion
return await create_chat_completion_response(request, chat_model)
File "/codex/llama-factory/src/llmtuner/api/chat.py", line 101, in create_chat_completion_response
responses = await chat_model.achat(
File "/codex/llama-factory/src/llmtuner/chat/chat_model.py", line 56, in achat
return await self.engine.chat(messages, system, tools, image, **input_kwargs)
File "/codex/llama-factory/src/llmtuner/chat/hf_engine.py", line 252, in chat
return await loop.run_in_executor(pool, self._chat, *input_args)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/codex/llama-factory/src/llmtuner/chat/hf_engine.py", line 142, in _chat
generate_output = model.generate(**gen_kwargs)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/transformers/generation/utils.py", line 1431, in generate
model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/transformers/generation/utils.py", line 463, in _prepare_attention_mask_for_generation
is_pad_token_in_inputs = (pad_token_id is not None) and (pad_token_id in inputs)
File "/miniconda3/envs/llama_factory_py3.9/lib/python3.9/site-packages/torch/_tensor.py", line 1091, in __contains__
return (element == self).any().item() # type: ignore[union-attr]
RuntimeError: call aclnnEqScalar failed, detail:EZ9999: Inner Error!
EZ9999: 2024-05-17-21:03:59.298.450 Cannot parse json for config file [/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe//kernel/config/ascend910/equal.json].
TraceBack (most recent call last):
Failed to parse kernel in equal.json.
AclOpKernelInit failed opType
Op Equal does not has any binary.
Kernel Run failed. opType: 88, Equal
launch failed for Equal, errno:561000.
[ERROR] 2024-05-17-21:03:59 (PID:183739, Device:0, RankID:-1) ERR01005 OPS internal error
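The last Python frames above show why a plain membership test reaches an NPU kernel at all: `torch.Tensor.__contains__` is implemented as an element-wise equality, `(element == self).any().item()`, so `pad_token_id in inputs` dispatches the `Equal` op to whatever device the tensor lives on. A pure-Python sketch of that dispatch path (the `FakeTensor` class is illustrative, not part of torch):

```python
# Illustrative stand-in for torch.Tensor's membership semantics: the "in"
# operator becomes an element-wise equality followed by a reduction, which
# on an NPU runs the Equal kernel that fails in the traceback above.
class FakeTensor:
    def __init__(self, data):
        self.data = data

    def __eq__(self, other):
        # Element-wise comparison, like torch's Equal op.
        return FakeTensor([x == other for x in self.data])

    def any(self):
        return any(self.data)

    def __contains__(self, element):
        # Mirrors torch/_tensor.py: return (element == self).any().item()
        return (element == self).any()


inputs = FakeTensor([101, 7592, 102])  # token ids
print(2 in inputs)    # checking pad_token_id still runs the Equal path
```
So even a simple `pad_token_id in inputs` check inside `generate()` requires the `Equal` kernel binaries to be installed and parseable.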
Resolved after installing Ascend-cann-kernels-910_8.0.RC1_linux.
A similar error occurs during training; please take a look, thanks for your support!
Training command:
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 llamafactory-cli train examples/full_multi_gpu/interlm2_full_sft.yaml
Ascend-cann-kernels version: Ascend-cann-kernels-910b_8.0.RC1.alpha003_linux.run
Error message:
RuntimeError: call aclnnCast failed, detail:EZ9999: Inner Error!
EZ9999 Cannot parse json for config file [/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe//kernel/config/ascend910/cast.json].
TraceBack (most recent call last):
Failed to parse kernel in cast.json.
AclOpKernelInit failed opType
Op Cast does not has any binary.
Kernel Run failed. opType: 53, Cast
launch failed for Cast, errno:561000.
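Both failures report "Cannot parse json for config file" for an op's kernel config, which typically means the kernels package is missing or does not match the installed toolkit/NPU variant. A hedged diagnostic sketch (the path comes from the error log; the helper name is made up for illustration):

```python
# Diagnostic sketch: check whether an Ascend op kernel config exists and is
# valid JSON. "missing" or "corrupt" both point to installing/reinstalling
# the Ascend-cann-kernels package that matches the toolkit and NPU model.
import json
import os


def check_kernel_config(path):
    if not os.path.exists(path):
        return "missing: install the matching Ascend-cann-kernels package"
    try:
        with open(path) as f:
            json.load(f)
    except ValueError:
        return "corrupt JSON: reinstall the kernels package"
    return "ok"


# Path taken from the cast.json error above (adjust to your install root):
print(check_kernel_config(
    "/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/"
    "ai_core/tbe/kernel/config/ascend910/cast.json"
))
```
Note the 910 vs 910b distinction in the package names above: the kernels `.run` file must match both the toolkit version and the exact NPU model.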
Reproduction
When using API inference on a Huawei Ascend NPU 910B, the server starts normally, but API calls fail with the following error:
Command used:
Expected behavior
No response
System Info
torch-npu=2.2.0 torch=2.2.0 Ascend-cann-toolkit_8.0.RC1_linux-aarch64 Ascend-cann-kernels-910b_8.0.RC1_linux
Tried switching CANN versions, but the error persists.
Others
No response