[Question]: Error when running the GPT model by following the documentation

Please ask your question

I followed the documentation (model_zoo/gpt-3/docs/quick_start.md) to run the GPT model and executed the following commands in order:

docker pull registry.baidubce.com/ppfleetx/fleetx-cuda11.2-cudnn8:dev
docker run -it --name=paddle --net=host -v /dev/shm:/dev/shm --shm-size=32G -v $PWD:/paddle --runtime=nvidia registry.baidubce.com/ppfleetx/ppfleetx-cuda11.2-cudnn8:v0.1.0 bash
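
One detail in the two commands above may be worth checking: the image that is pulled (`fleetx-cuda11.2-cudnn8:dev`) is not the image that is started (`ppfleetx-cuda11.2-cudnn8:v0.1.0`), so the container may ship a different Paddle build from the one the current quick start expects. Whether this contributes to the errors reported here is an assumption; the logs do not confirm it. A minimal sketch that extracts and compares the two image references (the helper name `image_ref` is hypothetical):

```python
import shlex

# The two commands exactly as given in this report.
PULL_CMD = "docker pull registry.baidubce.com/ppfleetx/fleetx-cuda11.2-cudnn8:dev"
RUN_CMD = (
    "docker run -it --name=paddle --net=host -v /dev/shm:/dev/shm "
    "--shm-size=32G -v $PWD:/paddle --runtime=nvidia "
    "registry.baidubce.com/ppfleetx/ppfleetx-cuda11.2-cudnn8:v0.1.0 bash"
)

REGISTRY_PREFIX = "registry.baidubce.com/"


def image_ref(command: str) -> str:
    """Return the first token that starts with the registry prefix (hypothetical helper)."""
    for token in shlex.split(command):
        if token.startswith(REGISTRY_PREFIX):
            return token
    raise ValueError("no image reference found in command")


pulled = image_ref(PULL_CMD)
started = image_ref(RUN_CMD)
images_match = pulled == started

print("pulled :", pulled)
print("started:", started)
print("same image:", images_match)
```

If the two references differ, running the container from the same image that was pulled is a cheap thing to rule out before debugging further.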

Inside the Docker container:

git clone https://github.com/PaddlePaddle/PaddleFleetX.git
cd PaddleFleetX
mkdir data
wget -O data/gpt_en_dataset_300m_ids.npy https://bj.bcebos.com/paddlenlp/models/transformers/gpt/data/gpt_en_dataset_300m_ids.npy
wget -O data/gpt_en_dataset_300m_idx.npz https://bj.bcebos.com/paddlenlp/models/transformers/gpt/data/gpt_en_dataset_300m_idx.npz

Training with the following command runs normally:

python ./tools/train.py -c ./ppfleetx/configs/nlp/gpt/pretrain_gpt_345M_single_card.yaml

I then ran inference with the following command:

python ./tools/inference.py -c ./ppfleetx/configs/nlp/gpt/inference_gpt_345M_single_card.yaml

It fails with an error that Tensor.copy cannot be found:

Traceback (most recent call last):
  File "./tools/inference.py", line 53, in <module>
    outs = engine.inference(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/eager_engine.py", line 864, in inference
    return self._inference_engine.predict(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/inference_engine.py", line 260, in predict
    handle.copy_from_cpu(np.array(d.copy()))
AttributeError: 'Tensor' object has no attribute 'copy'

Based on the error message, I modified the code as follows:

--- a/ppfleetx/core/engine/inference_engine.py
+++ b/ppfleetx/core/engine/inference_engine.py
@@ -257,7 +257,7 @@
                     raise ValueError()
                 for d, name in zip(data, self.input_names()):
                     handle = self.predictor.get_input_handle(name)
-                    handle.copy_from_cpu(np.array(d.copy()))
+                    handle.copy_from_cpu(np.array(d))
             elif isinstance(data, Mapping):
                 # key check
                 for k, v in data.items():
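
The AttributeError suggests that `d` is a Tensor rather than a NumPy array at this point. An alternative to simply dropping `.copy()` is to convert by duck typing, so that both kinds of input are accepted. The sketch below is an illustration only, not the project's fix: `to_host_value` and `FakeTensor` are hypothetical names, and `FakeTensor` merely stands in for a framework tensor that offers `.numpy()` but no `.copy()`.

```python
def to_host_value(d):
    """Return a value that np.array() can consume, without assuming d has .copy()."""
    # Tensor-like objects expose .numpy(); prefer it when it is available.
    if callable(getattr(d, "numpy", None)):
        return d.numpy()
    # ndarray-like objects and lists expose .copy(); keep the defensive copy.
    if callable(getattr(d, "copy", None)):
        return d.copy()
    # Plain scalars and anything else are passed through unchanged.
    return d


class FakeTensor:
    """Stand-in for a framework tensor: has .numpy() but no .copy()."""

    def __init__(self, values):
        self._values = list(values)

    def numpy(self):
        return list(self._values)


tensor_like = FakeTensor([1, 2, 3])
list_like = [4, 5, 6]

from_tensor = to_host_value(tensor_like)
from_list = to_host_value(list_like)
print(from_tensor, from_list)
```

Under this assumption, the call in `predict` would read `handle.copy_from_cpu(np.array(to_host_value(d)))`.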

After this change, I tried running inference again:

python ./tools/inference.py -c ./ppfleetx/configs/nlp/gpt/inference_gpt_345M_single_card.yaml

This time it fails with CUBLAS error(1), followed by CUDA error(700):

Traceback (most recent call last):
  File "./tools/inference.py", line 53, in <module>
    outs = engine.inference(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/eager_engine.py", line 864, in inference
    return self._inference_engine.predict(data)
  File "/paddle/PaddleFleetX/ppfleetx/core/engine/inference_engine.py", line 269, in predict
    self.predictor.run()
OSError: (External) CUBLAS error(1).
  [Hint: Please search for the error code(1) on website (https://docs.nvidia.com/cuda/cublas/index.html#cublasstatus_t) to get Nvidia's official solution and advice about CUBLAS Error.] (at /paddle/paddle/paddle/phi/backends/gpu/gpu_resources.cc:185)
  [operator < multihead_matmul > error]
terminate called after throwing an instance of 'phi::enforce::EnforceNotMet'
  what():  (External) CUDA error(700), an illegal memory access was encountered.
  [Hint: Please search for the error code(700) on website (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038) to get Nvidia's official solution and advice about CUDA Error.] (at /paddle/paddle/paddle/fluid/platform/device/gpu/gpu_info.cc:271)

Aborted (core dumped)
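
For reference, the two numeric codes in this log correspond to named statuses in NVIDIA's headers: cuBLAS status 1 is `CUBLAS_STATUS_NOT_INITIALIZED`, and CUDA runtime error 700 is `cudaErrorIllegalAddress`. A small sketch that pulls the codes out of the log text and looks them up; the table covers only these two codes, and `decode_errors` is a hypothetical helper:

```python
import re

# Only the two codes seen in this log; names come from NVIDIA's cuBLAS and CUDA runtime headers.
KNOWN_CODES = {
    ("CUBLAS", 1): "CUBLAS_STATUS_NOT_INITIALIZED",
    ("CUDA", 700): "cudaErrorIllegalAddress",
}

# Excerpt of the log lines that carry the error codes.
LOG = """\
OSError: (External) CUBLAS error(1).
  [operator < multihead_matmul > error]
  what():  (External) CUDA error(700), an illegal memory access was encountered.
"""

ERROR_RE = re.compile(r"\b(CUBLAS|CUDA) error\((\d+)\)")


def decode_errors(log_text):
    """Return (library, code, name) for every 'X error(N)' occurrence in the log."""
    found = []
    for library, code in ERROR_RE.findall(log_text):
        key = (library, int(code))
        found.append((library, int(code), KNOWN_CODES.get(key, "unknown")))
    return found


decoded = decode_errors(LOG)
for library, code, name in decoded:
    print(f"{library} error({code}) -> {name}")
```

`CUBLAS_STATUS_NOT_INITIALIZED` usually indicates that the cuBLAS handle could not be created, which may stem from an earlier CUDA runtime error or from the hardware or driver setup. This is a general reading of the status code, not a confirmed diagnosis for this case.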

Debugging with gdb shows that the problem occurs inside paddle::operators::MultiHeadMatMulV2Kernel; see the gdb backtrace below:

Thread 1 "python" hit Catchpoint 1 (exception thrown), __cxxabiv1::__cxa_throw (obj=0x43039290,
    tinfo=0x7f6d6c1427a8 ,                                                              
    dest=0x7f6d4ad15a40 )                                                          
    at ../../../../gcc-8.2.0/libstdc++-v3/libsupc++/eh_throw.cc:80                                                                
80      ../../../../gcc-8.2.0/libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory.                                      
(gdb) bt                                                                                                                          
#0  __cxxabiv1::__cxa_throw (obj=0x43039290, tinfo=0x7f6d6c1427a8 ,                     
    dest=0x7f6d4ad15a40 )                                                          
    at ../../../../gcc-8.2.0/libstdc++-v3/libsupc++/eh_throw.cc:80                                                                
#1  0x00007f6d4a360a5f in phi::InitBlasHandle(cublasContext**, CUstream_st*) [clone .cold.187] ()                                 
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#2  0x00007f6d5434a828 in phi::GPUContext::Impl::CublasCall(std::function const&)::{lambda()#1}::operator()
() const () from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                 
#3  0x00007f6dabbc6907 in __pthread_once_slow (once_control=0x75b6cb4,                                                            
    init_routine=0x7f6d9ca001e0 ) at pthread_once.c:116                                                      
#4  0x00007f6d543441c4 in phi::GPUContext::CublasCall(std::function const&) const ()                       
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#5  0x00007f6d4c2d84ae in void phi::funcs::Blas::GEMM(CBLAS_TRANSPOSE, CBLAS_TRANSPOSE, int, int, int, flo
at, float const*, float const*, float, float*) const ()                                                                           
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#6  0x00007f6d4c2d8e58 in void phi::funcs::Blas::MatMul(phi::DenseTensor const&, bool, phi::DenseTensor co
nst&, bool, float, phi::DenseTensor*, float) const ()                                                                             
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#7  0x00007f6d4db19a92 in paddle::operators::MultiHeadMatMulV2Kernel::Compute(paddle::framework::Execution
Context const&) const [clone .constprop.1064] ()                                                                                  
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#8  0x00007f6d4db1a264 in std::_Function_handler, paddle:
:operators::MultiHeadMatMulV2Kernel >::operator()(char const*, char const*, int) const::{lambda(paddle::fr
amework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) () from /usr/l
ocal/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                                        
#9  0x00007f6d4f5ac399 in paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&, paddl
e::framework::RuntimeContext*) const () from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                     
#10 0x00007f6d4f5ae3f4 in paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, phi::Place const&) const () from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                               
#11 0x00007f6d4f594aba in paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, phi::Place const&) ()             
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#12 0x00007f6d4ef3ec4d in paddle::framework::NaiveExecutor::Run() ()                                                              
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#13 0x00007f6d4b4b952b in paddle::AnalysisPredictor::ZeroCopyRun() ()                                                             
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#14 0x00007f6d4b07f810 in void pybind11::cpp_function::initialize(paddle::pybind::(anonymous namespace)::BindPaddleInferPredictor(pybind11::module_*)::{lambda(
paddle_infer::Predictor&)#2}&&, void (*)(paddle_infer::Predictor&), pybind11::name const&, pybind11::is_method const&, pybind11::s
ibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call)                               
    () from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                      
#15 0x00007f6d4ad2c333 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()                                     
   from /usr/local/lib/python3.7/dist-packages/paddle/fluid/libpaddle.so                                                          
#16 0x0000000000593784 in _PyMethodDef_RawFastCallKeywords ()                                                                     
#17 0x0000000000594731 in _PyObject_FastCallKeywords ()          
#18 0x0000000000548cc1 in ?? ()                                  
#19 0x000000000051566f in _PyEval_EvalFrameDefault ()            
#20 0x0000000000549e0e in _PyEval_EvalCodeWithName ()            
#21 0x0000000000593fce in _PyFunction_FastCallKeywords ()        
#22 0x0000000000511e2c in _PyEval_EvalFrameDefault ()            
#23 0x0000000000593dd7 in _PyFunction_FastCallKeywords ()        
#24 0x0000000000511e2c in _PyEval_EvalFrameDefault ()            
#25 0x0000000000549576 in _PyEval_EvalCodeWithName ()            
#26 0x0000000000604173 in PyEval_EvalCode ()                     
#27 0x00000000005f5506 in ?? ()                                  
#28 0x00000000005f8c6c in PyRun_FileExFlags ()

How should this problem be resolved? Thank you!

PaddlePaddle / PaddleNLP

[Question]: Error when running the GPT model by following the documentation #6158
