NVIDIA / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Caffe compilation fails with NCCL errors #540

Closed supreetha-s closed 5 years ago

supreetha-s commented 5 years ago

System details: Ubuntu 16.04.5, CUDA 10, NCCL libnccl2 (2.3.4), Caffe 0.17.1

Caffe compilation fails with the following errors. Can you please help me resolve this issue?

caffe-0.17.1$ make all -j4
CXX src/caffe/parallel.cpp
CXX src/caffe/solvers/rmsprop_solver.cpp
CXX src/caffe/solvers/adam_solver.cpp
CXX src/caffe/solvers/adadelta_solver.cpp
In file included from src/caffe/parallel.cpp:2:0:
src/caffe/parallel.cpp: In constructor ‘caffe::P2PManager::P2PManager(boost::shared_ptr, int, int, const caffe::SolverParameter&)’:
src/caffe/parallel.cpp:11:25: error: ‘NCCL_MAJOR’ was not declared in this scope
 #define CAFFE_NCCL_VER (NCCL_MAJOR*10000 + NCCL_MINOR*100)
                         ^
src/caffe/parallel.cpp:61:17: note: in expansion of macro ‘CAFFE_NCCL_VER’
   LOG_IF(FATAL, CAFFE_NCCL_VER < 20200) << "NCCL 2.2 or higher is required";
                 ^
src/caffe/parallel.cpp:11:44: error: ‘NCCL_MINOR’ was not declared in this scope
 #define CAFFE_NCCL_VER (NCCL_MAJOR*10000 + NCCL_MINOR*100)
                                            ^
src/caffe/parallel.cpp:61:17: note: in expansion of macro ‘CAFFE_NCCL_VER’
   LOG_IF(FATAL, CAFFE_NCCL_VER < 20200) << "NCCL 2.2 or higher is required";
                 ^
In file included from ./include/caffe/parallel.hpp:23:0,
                 from ./include/caffe/caffe.hpp:13,
                 from src/caffe/parallel.cpp:6:
src/caffe/parallel.cpp: In member function ‘virtual void caffe::P2PSync::on_start(const std::vector<boost::shared_ptr >&)’:
src/caffe/parallel.cpp:248:31: error: ‘ncclGroupStart’ was not declared in this scope
     NCCL_CHECK(ncclGroupStart());
                               ^
./include/caffe/util/nccl.hpp:10:25: note: in definition of macro ‘NCCL_CHECK’
   ncclResult_t result = condition; \
                         ^
src/caffe/parallel.cpp:255:29: error: ‘ncclGroupEnd’ was not declared in this scope
     NCCL_CHECK(ncclGroupEnd());
                             ^
./include/caffe/util/nccl.hpp:10:25: note: in definition of macro ‘NCCL_CHECK’
   ncclResult_t result = condition; \
                         ^
Makefile:610: recipe for target '.build_release/src/caffe/parallel.o' failed
make: *** [.build_release/src/caffe/parallel.o] Error 1
make: *** Waiting for unfinished jobs....
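
For reference, the version check quoted in the log combines NCCL_MAJOR and NCCL_MINOR into a single number and requires at least 20200, so NCCL 2.3.4 would give 2*10000 + 3*100 = 20300 and pass. The sketch below is one possible way to see which values a given nccl.h would feed into that check. It assumes the header declares the version as plain `#define NCCL_MAJOR <n>` and `#define NCCL_MINOR <n>` lines; the function names are hypothetical, and the header path is whatever your installation uses. It does not by itself explain why the macros were missing in this build.

```python
import re
import sys

# Minimum value accepted by the check quoted in the log (NCCL 2.2).
MIN_CAFFE_NCCL_VER = 20200


def caffe_nccl_ver(major, minor):
    """Mirror the CAFFE_NCCL_VER macro: NCCL_MAJOR*10000 + NCCL_MINOR*100."""
    return major * 10000 + minor * 100


def parse_nccl_version(header_text):
    """Return (major, minor) read from '#define' lines, or None if either is missing."""
    values = []
    for name in ("NCCL_MAJOR", "NCCL_MINOR"):
        pattern = r"^[ \t]*#[ \t]*define[ \t]+" + name + r"[ \t]+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


def check_header(header_text):
    """Summarize whether this header would satisfy the 'NCCL 2.2 or higher' check."""
    version = parse_nccl_version(header_text)
    if version is None:
        return "NCCL_MAJOR/NCCL_MINOR not defined in this header"
    ver = caffe_nccl_ver(*version)
    status = "ok" if ver >= MIN_CAFFE_NCCL_VER else "too old"
    return "CAFFE_NCCL_VER = %d (%s)" % (ver, status)


if __name__ == "__main__":
    # Usage: python check_nccl.py /path/to/nccl.h  (path supplied by the user)
    with open(sys.argv[1]) as header:
        print(check_header(header.read()))
```

If the script reports that the macros are not defined, the header it was pointed at cannot satisfy the check, which would match the "not declared in this scope" errors above; whether the build actually picks up that header is a separate question.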

drnikolaev commented 5 years ago

@supreetha-s please use CMake while we are fixing this error. Thank you.

supreetha-s commented 5 years ago

Thank you, will try with that and update.

dsgh2 commented 5 years ago

Hi. I have the same errors. Could you let me know how to solve them? CMake also fails for the same reason.

drnikolaev commented 5 years ago

Fixed in v0.17.2