facebookarchive / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0

Operator with engine CUDNN is not available for operator Relu #2130

Open dreamk73 opened 6 years ago

dreamk73 commented 6 years ago

The Caffe2 build seems to have gone OK. But when I run the test:

python -m caffe2.python.operator_test.relu_op_test

I get errors saying:

Operator with engine CUDNN is not available for operator Relu
Operator with engine CUDNN is not available for operator ReluGradient
...

The build can find cuDNN and I have the correct path set in my LD_LIBRARY_PATH. Any other ideas why I would get these errors?
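One way to double-check the LD_LIBRARY_PATH claim is to scan each directory on it for a cuDNN shared library. The following is only a minimal sketch using the Python standard library; the helper name `find_library_on_path` and the `libcudnn.so*` file pattern are assumptions made for illustration, and the check says nothing about whether Caffe2 itself registered a cuDNN implementation for a given operator.

```python
import fnmatch
import os


def find_library_on_path(path_value, pattern="libcudnn.so*"):
    """Return full paths of files matching `pattern` in a path list.

    `path_value` is a string in the same format as LD_LIBRARY_PATH
    (directories joined by os.pathsep, which is ':' on Linux).
    """
    matches = []
    for directory in path_value.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue  # skip empty entries and directories that do not exist
        for name in sorted(os.listdir(directory)):
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(directory, name))
    return matches


if __name__ == "__main__":
    hits = find_library_on_path(os.environ.get("LD_LIBRARY_PATH", ""))
    print(hits if hits else "no libcudnn.so* found on LD_LIBRARY_PATH")
```

An empty result would suggest the path is not set the way it was intended; a non-empty result only confirms the file is present on the path.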

System information

CMake summary output

-- Does not need to define long separately.
-- Current compiler supports avx2 extention. Will build perfkernels.
-- Caffe2: Found protobuf with old-style protobuf targets.
-- Caffe2 protobuf include directory: /usr/include
-- The BLAS backend of choice:Eigen
-- Found NNPACK (include: /usr/local/include, library: /usr/local/lib64/libnnpack.a)
-- Found PTHREADPOOL (library: /usr/local/lib64/libpthreadpool.a)
-- Found CPUINFO (library: /usr/local/lib64/libcpuinfo.a)
INFOFound external NNPACK installation.
-- Caffe2: Found gflags with new-style gflags target.
-- Caffe2: Found glog with new-style glog target.
-- Found PythonInterp: /data/miniconda/bin/python (found version "2.7.14")
-- git Version: v0.0.0-dirty
-- Version: 0.0.0
-- Performing Test HAVE_STD_REGEX
-- Performing Test HAVE_STD_REGEX
-- Performing Test HAVE_STD_REGEX -- compiled but failed to run
-- Performing Test HAVE_GNU_POSIX_REGEX
-- Performing Test HAVE_GNU_POSIX_REGEX
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX
-- Performing Test HAVE_POSIX_REGEX
-- Performing Test HAVE_POSIX_REGEX -- success
-- Performing Test HAVE_STEADY_CLOCK
-- Performing Test HAVE_STEADY_CLOCK
-- Performing Test HAVE_STEADY_CLOCK -- success
-- Found lmdb (include: /usr/include, library: /usr/lib64/liblmdb.so)
-- Found LevelDB (include: /usr/include, library: /usr/lib64/libleveldb.so)
-- Found Snappy (include: /usr/include, library: /usr/lib64/libsnappy.so)
CMake Warning at cmake/Dependencies.cmake:222 (message):
  Not compiling with OpenCV. Suppress this warning with -DUSE_OPENCV=OFF
Call Stack (most recent call first):
  CMakeLists.txt:101 (include)

CMake Warning at cmake/Dependencies.cmake:243 (find_package):
  By not providing "FindEigen3.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Eigen3", but
  CMake did not find one.

  Could not find a package configuration file provided by "Eigen3" with any
  of the following names:

    Eigen3Config.cmake
    eigen3-config.cmake

  Add the installation prefix of "Eigen3" to CMAKE_PREFIX_PATH or set
  "Eigen3_DIR" to a directory containing one of the above files. If "Eigen3"
  provides a separate development package or SDK, be sure it has been
  installed.
Call Stack (most recent call first):
  CMakeLists.txt:101 (include)

-- Did not find system Eigen. Using third party subdirectory.
-- Found PythonInterp: /data/miniconda/bin/python (found suitable version "2.7.14", minimum required is "2.7")
-- NumPy ver. 1.14.0 found (include: /data/miniconda/lib/python2.7/site-packages/numpy/core/include)
-- Could NOT find pybind11 (missing: pybind11_INCLUDE_DIR)
-- Could NOT find MPI_C (missing: MPI_C_LIBRARIES MPI_C_INCLUDE_PATH)
-- Could NOT find MPI_CXX (missing: MPI_CXX_LIBRARIES MPI_CXX_INCLUDE_PATH)
CMake Warning at cmake/Dependencies.cmake:302 (message):
  Not compiling with MPI. Suppress this warning with -DUSE_MPI=OFF
Call Stack (most recent call first):
  CMakeLists.txt:101 (include)

-- Caffe2: CUDA detected: 9.0
-- Found cuDNN: v7.0.5 (include: /usr/local/cuda/include, library: /usr/local/cuda/lib64/libcudnn.so)
-- Automatic GPU detection returned 6.1 6.1.
-- Added CUDA NVCC flags for: sm_61
-- Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES)
-- Could NOT find CUB (missing: CUB_INCLUDE_DIR)
-- Could NOT find Gloo (missing: Gloo_INCLUDE_DIR Gloo_LIBRARY)
-- CUDA detected: 9.0
-- Found libcuda: /usr/lib64/libcuda.so
-- Found libnvrtc: /usr/local/cuda/lib64/libnvrtc.so
CMake Warning at cmake/Dependencies.cmake:430 (message):
  mobile opengl is only used in android or ios builds.
Call Stack (most recent call first):
  CMakeLists.txt:101 (include)

CMake Warning at cmake/Dependencies.cmake:506 (message):
  Metal is only used in ios builds.
Call Stack (most recent call first):
  CMakeLists.txt:101 (include)

-- GCC 4.8.5: Adding gcc and gcc_s libs to link line
-- Include NCCL operators
-- Excluding image processing operators due to no opencv
-- Excluding video processing operators due to no opencv
-- Excluding mkl operators as we are not using mkl
-- MPI operators skipped due to no MPI support
-- Include Observer library
-- Using lib/python2.7/site-packages as python relative installation path
-- Automatically generating missing __init__.py files.

-- **** Summary ****
-- General:
-- CMake version : 3.6.3
-- CMake command : /usr/bin/cmake3
-- Git version : v0.8.1-1237-gbaf8cdb-dirty
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 4.8.5
-- Protobuf compiler : /usr/bin/protoc
-- Protobuf include path : /usr/include
-- Protobuf libraries : /usr/lib64/libprotobuf.so;-pthread
-- BLAS : Eigen
-- CXX flags : -std=c++11 -O2 -fPIC -Wno-narrowing -Wno-invalid-partial-specialization
-- Build type : Release
-- Compile definitions :

-- BUILD_BINARY : ON
-- BUILD_DOCS : OFF
-- BUILD_PYTHON : ON
-- Python version : 2.7.14
-- Python library : /data/miniconda/lib/libpython2.7.so
-- BUILD_SHARED_LIBS : ON
-- BUILD_TEST : ON
-- USE_ATEN : OFF
-- USE_ASAN : OFF
-- USE_CUDA : ON
-- CUDA version : 9.0
-- CuDNN version : 7.0.5
-- CUDA root directory : /usr/local/cuda
-- CUDA library : /usr/lib64/libcuda.so
-- CUDA NVRTC library : /usr/local/cuda/lib64/libnvrtc.so
-- CUDA runtime library: /usr/local/cuda/lib64/libcudart.so
-- CUDA include path : /usr/local/cuda/include
-- NVCC executable : /usr/local/cuda/bin/nvcc
-- CUDA host compiler : /usr/bin/cc
-- USE_EIGEN_FOR_BLAS : 1
-- USE_FFMPEG : OFF
-- USE_GFLAGS : ON
-- USE_GLOG : ON
-- USE_GLOO : ON
-- USE_LEVELDB : ON
-- LevelDB version : 1.12
-- Snappy version : 1.1.0
-- USE_LITE_PROTO : OFF
-- USE_LMDB : ON
-- LMDB version : 0.9.18
-- USE_METAL : OFF
-- USE_MKL :
-- USE_MOBILE_OPENGL : OFF
-- USE_MPI : OFF
-- USE_NCCL : ON
-- USE_NERVANA_GPU : OFF
-- USE_NNPACK : ON
-- USE_OBSERVERS : ON
-- USE_OPENCV : OFF
-- USE_OPENMP : OFF
-- USE_PROF : OFF
-- USE_REDIS : OFF
-- USE_ROCKSDB : OFF
-- USE_THREADS : ON
-- USE_ZMQ : OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /data/root/setup_programs/caffe2/build

yuzcccc commented 6 years ago

I met the same problem

dreamk73 commented 6 years ago

I think the problem is that cmake doesn't find the third_party modules even though they are there. Do I need to add them to a CMakeLists.txt or a similar file, and if so, where?

bddppq commented 6 years ago

This is not an error message. You see these because, by default, for every CUDA operator we first check whether a cuDNN implementation exists (which is supposedly more performant) and, if not, fall back to the default implementation. In this case the Relu and ReluGradient ops don't have a cuDNN implementation, so the cuDNN lookup is expected to fail. We recently started printing these messages for better debuggability, but it looks like they have caused some confusion.
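The lookup order described in this comment can be illustrated with a toy registry. This is a minimal sketch, not Caffe2's actual implementation: the `REGISTRY` dict, the `lookup` function, and the operator `OpWithCudnn` are all hypothetical names invented for the example. Only the message text and the fact that Relu has no cuDNN implementation come from the thread.

```python
# Hypothetical registry: (operator type, engine) -> implementation name.
# The empty string stands for the default (non-engine) implementation.
REGISTRY = {
    ("Relu", ""): "default Relu kernel",
    ("OpWithCudnn", ""): "default OpWithCudnn kernel",
    ("OpWithCudnn", "CUDNN"): "cuDNN OpWithCudnn kernel",
}


def lookup(op_type, preferred_engines=("CUDNN",), registry=REGISTRY):
    """Try each preferred engine in order, then fall back to the default.

    Returns (implementation, log), where log holds one message per
    engine that was tried and not found.
    """
    log = []
    for engine in preferred_engines:
        impl = registry.get((op_type, engine))
        if impl is not None:
            return impl, log
        log.append(
            "Operator with engine %s is not available for operator %s"
            % (engine, op_type)
        )
    return registry[(op_type, "")], log


if __name__ == "__main__":
    impl, log = lookup("Relu")
    print(impl)
    for line in log:
        print(line)
```

In this sketch the Relu lookup still succeeds through the default implementation; the message is only a record of the engine that was tried first.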

@jspark1105 I think we should print the warning/log only when the engine is explicitly specified in the OperatorDef and not found.
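The proposed logging policy could look roughly like the sketch below. It is an illustration under assumptions, not the change that was eventually made: `lookup_quiet`, its parameters, and the decision to keep trying the default engines after an explicit request fails are all hypothetical.

```python
def lookup_quiet(op_type, registry, default_engines=("CUDNN",),
                 requested_engine=None):
    """Return (implementation, warnings).

    A warning is recorded only when the caller explicitly asked for
    `requested_engine` and no such implementation is registered.
    Engines that are merely tried by default are skipped silently.
    """
    warnings = []
    if requested_engine:
        impl = registry.get((op_type, requested_engine))
        if impl is not None:
            return impl, warnings
        warnings.append(
            "Operator with engine %s is not available for operator %s"
            % (requested_engine, op_type)
        )
    for engine in default_engines:
        impl = registry.get((op_type, engine))
        if impl is not None:
            return impl, warnings
    # Fall back to the default implementation (empty engine string).
    return registry[(op_type, "")], warnings


if __name__ == "__main__":
    toy_registry = {("Relu", ""): "default Relu kernel"}
    print(lookup_quiet("Relu", toy_registry))
    print(lookup_quiet("Relu", toy_registry, requested_engine="CUDNN"))
```

With this policy a plain Relu lookup produces no message, while a lookup that names CUDNN explicitly still reports that the engine is missing.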

jspark1105 commented 6 years ago

Got it. This is a good point. I'll take a look.

dreamk73 commented 6 years ago

OK, that is good to know. It still doesn't solve the problem that when I run cmake3 .. in the build directory, it says it can't find nccl, gloo, eigen, etc., even though they are present in the third_party directory. I downloaded a fresh copy of caffe2 with the --recursive flag.
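To see exactly which packages the configure step reported as missing, the "Could NOT find" entries can be pulled out of the log. The sketch below uses an excerpt copied from the CMake output quoted earlier in this thread; the function name `missing_packages` is hypothetical. Note that these messages do not by themselves show the build failed to use a bundled copy: for Eigen, the same log reports "Did not find system Eigen. Using third party subdirectory."

```python
import re

# Excerpt of the configure output quoted earlier in this thread.
LOG = (
    "-- Could NOT find pybind11 (missing: pybind11_INCLUDE_DIR) "
    "-- Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES) "
    "-- Could NOT find CUB (missing: CUB_INCLUDE_DIR) "
    "-- Could NOT find Gloo (missing: Gloo_INCLUDE_DIR Gloo_LIBRARY)"
)


def missing_packages(log_text):
    """Return {package: [missing variables]} for each 'Could NOT find' entry."""
    pattern = re.compile(r"Could NOT find (\w+) \(missing: ([^)]*)\)")
    return {name: variables.split()
            for name, variables in pattern.findall(log_text)}


if __name__ == "__main__":
    for package, variables in sorted(missing_packages(LOG).items()):
        print(package, variables)
```

Comparing this list against the final summary (for example USE_NCCL and USE_GLOO) is one way to judge whether a "Could NOT find" line actually disabled a feature.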

huangh12 commented 6 years ago

Hello, I compiled Caffe2 with CUDA 8.0 and cuDNN 6.0.5 successfully. But when I run mnist.ipynb in the tutorial directory, it prints: Operator with engine CUDNN is not available for operator Conv. This is quite weird since I compiled Caffe2 with cuDNN. Does the basic Conv operator really have no cuDNN support?

himankmaan commented 6 years ago

@dreamk73 Download all of them individually and place them in the third_party folder.

dreamk73 commented 6 years ago

The problem is not that the files are not downloaded. They are present in the third_party directory already. The problem is that when I run cmake it doesn't include these files or says it can't find them.
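A quick way to confirm that the checked-out dependencies are really populated is to look for empty subdirectories, which is what a clone without submodule contents typically leaves behind. This is only a sketch; the helper name `empty_subdirs` is hypothetical, and a non-empty directory does not prove that cmake is configured to use it.

```python
import os


def empty_subdirs(root):
    """Return names of immediate subdirectories of `root` with no entries."""
    empty = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path) and not os.listdir(path):
            empty.append(name)
    return empty
```

Calling `empty_subdirs("third_party")` from the top of the source tree and getting an empty list would be consistent with the statement above that the files are present.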

himankmaan commented 6 years ago

Show the error

rafikg commented 5 years ago

@bddppq Do you mean that when we get this warning, for example "Engine CUDNN is not available for operator MaxPool", it means that cuDNN does not have a MaxPool operator?

CarlosYeverino commented 5 years ago

Hi @huangh12,

Were you able to use engine CUDNN for the operator Conv?