PaddlePaddle / models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
Apache License 2.0

Trying to train a TensorFlow model, but a Paddle error is reported #4679

Open ghost opened 4 years ago

ghost commented 4 years ago

While training the TensorFlow nonlocal model, I loaded the Paddle nonlocal model pretrained on ImageNet and hit the following error:

2020-06-02 11:34:12.696932: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-06-02 11:34:13.135866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53 pciBusID: 0000:3e:00.0 totalMemory: 31.72GiB freeMemory: 26.64GiB
2020-06-02 11:34:13.135905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2020-06-02 11:34:13.989357: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-02 11:34:13.989401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2020-06-02 11:34:13.989410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2020-06-02 11:34:13.991195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 25848 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:3e:00.0, compute capability: 7.0)
run ini time is 1.33219480515
2020-06-02 11:34:35.715417: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2020-06-02 11:34:35.716288: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
W0602 11:34:35.716351 87921 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0602 11:34:35.716372 87921 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0602 11:34:35.716377 87921 init.cc:214] The detail failure signal is:

W0602 11:34:35.716383 87921 init.cc:217] Aborted at 1591068875 (unix time) try "date -d @1591068875" if you are using GNU date
2020-06-02 11:34:35.718102: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2020-06-02 11:34:35.718838: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
W0602 11:34:35.718925 87921 init.cc:217] PC: @ 0x0 (unknown)
W0602 11:34:35.719225 87921 init.cc:217] SIGABRT (@0x155ce) received by PID 87502 (TID 0x7fd3cdffb700) from PID 87502; stack trace:
2020-06-02 11:34:35.720662: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
W0602 11:34:35.721504 87921 init.cc:217] @ 0x7fd5c20f25e0 (unknown)
2020-06-02 11:34:35.722468: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
W0602 11:34:35.723459 87921 init.cc:217] @ 0x7fd5c164b1f7 GI_raise
2020-06-02 11:34:35.725009: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
W0602 11:34:35.725256 87921 init.cc:217] @ 0x7fd5c164c8e8 GI_abort
2020-06-02 11:34:35.726091: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
2020-06-02 11:34:35.728042: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2020-06-02 11:34:35.728924: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
W0602 11:34:35.730231 87921 init.cc:217] @ 0x7fd4fbbcc374 tensorflow::internal::LogMessageFatal::~LogMessageFatal()
2020-06-02 11:34:35.730621: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7605 (compatibility version 7600) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2020-06-02 11:34:35.731292: F tensorflow/core/kernels/conv_ops_3d.cc:399] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
W0602 11:34:35.733728 87921 init.cc:217] @ 0x7fd4fb8cdc47 tensorflow::LaunchConvOp<>::launch()
W0602 11:34:35.738135 87921 init.cc:217] @ 0x7fd4fb8ce019 tensorflow::Conv3DOp<>::Compute()
W0602 11:34:35.739372 87921 init.cc:217] @ 0x7fd4f6e36289 tensorflow::BaseGPUDevice::ComputeHelper()
W0602 11:34:35.740073 87921 init.cc:217] @ 0x7fd4f6e36750 tensorflow::BaseGPUDevice::Compute()
W0602 11:34:35.740639 87921 init.cc:217] @ 0x7fd4f6e70365 tensorflow::(anonymous namespace)::ExecutorState::Process()
W0602 11:34:35.741199 87921 init.cc:217] @ 0x7fd4f6e70b7a _ZNSt17_Function_handlerIFvvEZN10tensorflow12_GLOBALN_113ExecutorState13ScheduleReadyERKNS1_3gtl13InlinedVectorINS3_10TaggedNodeELi8EEEPNS3_20TaggedNodeReadyQueueEEUlvE_E9_M_invokeERKSt9_Any_data
W0602 11:34:35.742094 87921 init.cc:217] @ 0x7fd4f6ae18ba Eigen::NonBlockingThreadPoolTempl<>::WorkerLoop()
W0602 11:34:35.742883 87921 init.cc:217] @ 0x7fd4f6ae0962 _ZNSt17_Function_handlerIFvvEZN10tensorflow6thread16EigenEnvironment12CreateThreadESt8functionIS0_EEUlvE_E9_M_invokeERKSt9_Any_data
W0602 11:34:35.743469 87921 init.cc:217] @ 0x7fd4ed435070 (unknown)
W0602 11:34:35.744760 87921 init.cc:217] @ 0x7fd5c20eae25 start_thread
W0602 11:34:35.746145 87921 init.cc:217] @ 0x7fd5c170e35d clone
W0602 11:34:35.747447 87921 init.cc:217] @ 0x0 (unknown)

NHZlX commented 4 years ago

Could you provide specific steps to reproduce this?