Traceback (most recent call last):
  File "./tools/inference.py", line 53, in <module>
    outs = engine.inference(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/eager_engine.py", line 864, in inference
    return self._inference_engine.predict(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/inference_engine.py", line 260, in predict
    handle.copy_from_cpu(np.array(d.copy()))
AttributeError: 'Tensor' object has no attribute 'copy'
Based on the error message, I modified the code as follows:
--- a/ppfleetx/core/engine/inference_engine.py
+++ b/ppfleetx/core/engine/inference_engine.py
@@ -257,7 +257,7 @@
                 raise ValueError()
             for d, name in zip(data, self.input_names()):
                 handle = self.predictor.get_input_handle(name)
-                handle.copy_from_cpu(np.array(d.copy()))
+                handle.copy_from_cpu(np.array(d))
         elif isinstance(data, Mapping):
             # key check
             for k, v in data.items():
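The patch above drops `.copy()` because the input is a `Tensor` rather than a NumPy array. As a minimal sketch of the same idea, the helper below converts either kind of input into a fresh host-side NumPy array before it would be handed to `copy_from_cpu`. The names `to_host_array` and `FakeTensor` are hypothetical and are not part of PaddleFleetX; the sketch only assumes that tensor-like objects expose a `.numpy()` method, as `paddle.Tensor` does.

```python
import numpy as np


def to_host_array(d):
    """Hypothetical helper: return a NumPy copy of `d` on the host."""
    # Tensor-like objects (paddle.Tensor among them) expose .numpy().
    if hasattr(d, "numpy") and callable(d.numpy):
        d = d.numpy()
    # np.array copies by default, so the caller's buffer is left untouched.
    return np.array(d)


class FakeTensor:
    """Stand-in for a framework tensor, used only for this demonstration."""

    def __init__(self, values):
        self._values = np.asarray(values)

    def numpy(self):
        return self._values


src = np.arange(4, dtype=np.int64)
out_from_array = to_host_array(src)
out_from_tensor = to_host_array(FakeTensor([1, 2, 3]))
```

Because `np.array` already copies its argument, the explicit `.copy()` in the original line was redundant for NumPy inputs and simply invalid for `Tensor` inputs.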
Traceback (most recent call last):
  File "./tools/inference.py", line 53, in <module>
    outs = engine.inference(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/eager_engine.py", line 864, in inference
    return self._inference_engine.predict(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/inference_engine.py", line 269, in predict
    self.predictor.run()
OSError: (External) CUBLAS error(1).
  [Hint: Please search for the error code(1) on website (https://docs.nvidia.com/cuda/cublas/index.html#cublasstatus_t) to get Nvidia's official solution and advice about CUBLAS Error.] (at /paddle/paddle/paddle/phi/backends/gpu/gpu_resources.cc:185)
  [operator < multihead_matmul > error]
terminate called after throwing an instance of 'phi::enforce::EnforceNotMet'
  what(): (External) CUDA error(700), an illegal memory access was encountered.
  [Hint: Please search for the error code(700) on website (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038) to get Nvidia's official solution and advice about CUDA Error.] (at /paddle/paddle/paddle/fluid/platform/device/gpu/gpu_info.cc:271)
Aborted (core dumped)
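Please describe your question

I followed the documentation (model_zoo/gpt-3/docs/quick_start.md) to run the GPT model, executing the following commands in order:

Inside the docker environment:

Running training with the following command works normally.

Running inference with the following command fails with an error saying that `Tensor.copy` cannot be found:

Based on the error message, I modified the code as follows:

After the modification, I tried running inference again. It fails with `CUDA error(700)`:

Debugging with gdb shows that the problem is inside `paddle::operators::MultiHeadMatMulV2Kernel`; see the gdb backtrace information below:

How should this problem be resolved? Thanks!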
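One way to narrow down an illegal memory access like the one reported above is to rule out malformed inputs before `self.predictor.run()` is called. The sketch below is a diagnostic aid only and does not claim to identify the cause of this failure. The function `check_token_ids` and its parameters `vocab_size` and `max_seq_len` are hypothetical; the real limits would have to come from the exported model's configuration, and the `[batch, seq_len]` layout is an assumption.

```python
import numpy as np


def check_token_ids(ids, vocab_size, max_seq_len):
    """Hypothetical pre-flight check for a [batch, seq_len] token-id input."""
    arr = np.asarray(ids)
    # Token ids must be integers; a float array here usually means a
    # conversion step went wrong upstream.
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError(f"expected an integer dtype, got {arr.dtype}")
    # Assumed layout: one row per sample, one column per token.
    if arr.ndim != 2:
        raise ValueError(f"expected shape [batch, seq_len], got {arr.shape}")
    if arr.shape[1] > max_seq_len:
        raise ValueError(f"sequence length {arr.shape[1]} exceeds {max_seq_len}")
    # Out-of-range ids can index past the end of an embedding table on the GPU.
    if arr.size and (arr.min() < 0 or arr.max() >= vocab_size):
        raise ValueError("token id outside [0, vocab_size)")
    return arr


# The limits below are made-up values for illustration.
ok = check_token_ids([[1, 2, 3, 4]], vocab_size=100, max_seq_len=8)

try:
    check_token_ids([[1, 2, 500]], vocab_size=100, max_seq_len=8)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```

If the inputs pass such checks, rerunning with the environment variable `CUDA_LAUNCH_BLOCKING=1` set makes kernel launches synchronous, which tends to make the reported failing operator more reliable.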