rioyokotalab / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai

Caffe2 Setup #16

Closed Hiroki11x closed 6 years ago

Hiroki11x commented 6 years ago

https://github.com/rioyokotalab/caffe2/wiki/Caffe2-build-on-ReedBush

I tried

make install -j 128
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormat::ReadPackedEnumPreserveUnknowns(google::protobuf::io::CodedInputStream*, unsigned int, bool (*)(int), google::protobuf::UnknownFieldSet*, google::protobuf::RepeatedField<int>*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::IncrementRecursionDepthAndPushLimit(int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::Int32Size(google::protobuf::RepeatedField<int> const&)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadVarint32Fallback(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteBytesMaybeAliased(int, std::string const&, google::protobuf::io::CodedOutputStream*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedOutputStream::WriteVarint64SlowPath(unsigned long)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::RegisterAllTypes(google::protobuf::Metadata const*, int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedOutputStream::WriteVarint32SlowPath(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::InitProtobufDefaults()'
libCaffe2_CPU.so: undefined reference to `google::protobuf::Message::SpaceUsedLong() const'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteDoubleArray(double const*, int, google::protobuf::io::CodedOutputStream*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::BytesUntilTotalBytesLimit() const'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadVarintSizeAsIntFallback()'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadTagFallback(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::RepeatedPtrFieldBase::InternalExtend(int)'
collect2: error: ld returned 1 exit status
make[2]: *** [caffe2/binaries/blob_test] Error 1
make[1]: *** [caffe2/CMakeFiles/blob_test.dir/all] Error 2
Linking CXX shared module python/caffe2_pybind11_state_gpu.so
Linking CXX shared module python/caffe2_pybind11_state.so
[100%] Built target caffe2_pybind11_state_gpu
[100%] Built target caffe2_pybind11_state
make: *** [all] Error 2

$ pyenv --version
pyenv 1.1.3

$ pyenv versions
  system
* 2.7.10 (set by /lustre/gi75/i75012/env/src/pyenv/version)
  3.4.3
  3.5.0

So I'll try switching to Python 3 next.

$ pip install protobuf

Hiroki11x commented 6 years ago

[ 99%] [100%] [100%] [100%] [100%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state.cc.o
Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_mkl.cc.o
Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state_gpu.dir/python/pybind_state.cc.o
Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state_gpu.dir/python/pybind_state_gpu.cc.o
Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state_gpu.dir/python/pybind_state_mkl.cc.o
Linking CXX executable binaries/cpuid_test
Linking CXX executable binaries/fixed_divisor_test
Linking CXX executable binaries/timer_test
Linking CXX executable binaries/fatal_signal_asan_no_sig_test
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteStringMaybeAliased(int, std::string const&, google::protobuf::io::CodedOutputStream*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedOutputStream::WriteStringWithSizeToArray(std::string const&, unsigned char*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::DecrementRecursionDepthAndPopLimit(int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::UInt32Size(google::protobuf::RepeatedField<unsigned int> const&)'

~~
~~

libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::IncrementRecursionDepthAndPushLimit(int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::Int32Size(google::protobuf::RepeatedField<int> const&)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadVarint32Fallback(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteBytesMaybeAliased(int, std::string const&, google::protobuf::io::CodedOutputStream*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedOutputStream::WriteVarint64SlowPath(unsigned long)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::RegisterAllTypes(google::protobuf::Metadata const*, int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedOutputStream::WriteVarint32SlowPath(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::InitProtobufDefaults()'
libCaffe2_CPU.so: undefined reference to `google::protobuf::Message::SpaceUsedLong() const'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteDoubleArray(double const*, int, google::protobuf::io::CodedOutputStream*)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::BytesUntilTotalBytesLimit() const'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadVarintSizeAsIntFallback()'
libCaffe2_CPU.so: undefined reference to `google::protobuf::io::CodedInputStream::ReadTagFallback(unsigned int)'
libCaffe2_CPU.so: undefined reference to `google::protobuf::internal::RepeatedPtrFieldBase::InternalExtend(int)'
collect2: error: ld returned 1 exit status
make[2]: *** [caffe2/binaries/blob_test] Error 1
make[1]: *** [caffe2/CMakeFiles/blob_test.dir/all] Error 2
Linking CXX shared module python/caffe2_pybind11_state.so
[100%] Built target caffe2_pybind11_state
Linking CXX shared module python/caffe2_pybind11_state_gpu.so
[100%] Built target caffe2_pybind11_state_gpu
make: *** [all] Error 2

The build has to use the protobuf installed at /lustre/gi75/i75012/env/local/protobuf-3.3.0
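Undefined references to google::protobuf symbols at link time usually mean that the headers used to compile libCaffe2_CPU.so and the libprotobuf picked up by the linker come from different protobuf versions. A minimal sketch, assuming that is the cause here, of putting the local protobuf-3.3.0 ahead of any system copy before re-running cmake. The helper name use_local_prefix is hypothetical, and it assumes the prefix has the usual bin/ and lib/ layout.

```shell
# Hypothetical helper: put a locally built package ahead of system copies.
# Assumes the prefix contains bin/ and lib/ subdirectories.
use_local_prefix() {
    prefix="$1"
    export PATH="$prefix/bin:$PATH"
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export CMAKE_PREFIX_PATH="$prefix${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
}

use_local_prefix /lustre/gi75/i75012/env/local/protobuf-3.3.0

# Report which protoc is now first on PATH, if any.
if command -v protoc >/dev/null 2>&1; then
    echo "protoc: $(command -v protoc)"
    protoc --version
else
    echo "protoc not found on PATH"
fi
```

Removing the old build directory before re-running cmake avoids reusing a cached path to a different protobuf.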

Hiroki11x commented 6 years ago

I created a build script:

mkdir build
cd build
LOCAL=/lustre/gi75/i75012/env/local
PYTHON_INCLUDE=/usr/include/python2.7:/lustre/gi75/i75012/env/src/pyenv/versions/2.7.10/lib/python2.7/site-packages/numpy/core/include
PYTHON_LIB=/usr/lib
HDF5_HL_LIBRARIES=/lustre/gi75/i75012/env/local/hdf5-1.10.0-patch1/lib

declare -a PACKAGES=(\
'boost_1_63_0' \
'gflags-2.2.0' \
'glog-0.3.4' \
'hdf5-1.10.0-patch1' \
'lmdb-LMDB_0.9.18' \
'protobuf-3.3.0' \
'cuda' \
'snappy-1.1.4' \
'opencv-2.4.13' \
'nccl-1.3.4-1' \
'ATLAS' \
)

PREFIX_PATH="/lustre/gi75/i75012/env/local/cudnn7/cuda:/usr/local/cuda"

for s in "${PACKAGES[@]}"; do
PREFIX_PATH=$PREFIX_PATH:$LOCAL/$s
done

CMAKE_PREFIX_PATH=$PYTHON_INCLUDE:$PYTHON_LIB:$HDF5_HL_LIBRARIES:$PREFIX_PATH cmake \
-DCUDA_TOOLKIT_ROOT_DIR=/lustre/app/acc/cuda/8.0 \
-DCMAKE_INSTALL_PREFIX=/lustre/gi75/i75012/dl/caffe2/local \
-DUSE_NCCL=ON \
-DUSE_LEVELDB=OFF \
.. | tee configure.log

make all -j 248 && make test -j 248 && make install

Hiroki11x commented 6 years ago
-- Include NCCL operators
CMake Error at caffe2/contrib/CMakeLists.txt:6 (add_subdirectory):
  The source directory

    /home/gi75/i75012/dl/caffe2/caffe2/contrib/opengl

  does not contain a CMakeLists.txt file.

Hiroki11x commented 6 years ago
pip install numpy
pip install future
pip install protobuf

git clone https://github.com/caffe2/caffe2.git && cd caffe2
git submodule update --init

rm -rf build && mkdir build && cd build

LOCAL=/lustre/gi75/i75012/env/local
PYTHON_INCLUDE=/usr/include/python2.7:/lustre/gi75/i75012/env/src/pyenv/versions/2.7.10/lib/python2.7/site-packages/numpy/core/include
PYTHON_LIB=/usr/lib
HDF5_HL_LIBRARIES=/lustre/gi75/i75012/env/local/hdf5-1.10.0-patch1/lib

declare -a PACKAGES=(\
'boost_1_63_0' \
'gflags-2.2.0' \
'glog-0.3.4' \
'hdf5-1.10.0-patch1' \
'lmdb-LMDB_0.9.18' \
'protobuf-3.3.0' \
'cuda' \
'snappy-1.1.4' \
'opencv-2.4.13' \
'nccl-1.3.4-1' \
'ATLAS' \
)

PREFIX_PATH="/lustre/gi75/i75012/env/local/cudnn7/cuda:/usr/local/cuda"

for s in "${PACKAGES[@]}"; do
PREFIX_PATH=$PREFIX_PATH:$LOCAL/$s
done

CMAKE_PREFIX_PATH=$PYTHON_INCLUDE:$PYTHON_LIB:$HDF5_HL_LIBRARIES:$PREFIX_PATH cmake \
-DCUDA_TOOLKIT_ROOT_DIR=/lustre/app/acc/cuda/8.0 \
-DCMAKE_INSTALL_PREFIX=/lustre/gi75/i75012/dl/caffe2/local \
-DBLAS=Eigen \
-DUSE_CUDA=ON \
-DUSE_ROCKSDB=OFF \
-DUSE_GLOO=ON \
-DUSE_REDIS=ON \
-DUSE_OPENCV=ON \
-DUSE_GFLAGS=OFF \
.. | tee configure.log

make all -j 248 && make test -j 248 && make install

Hiroki11x commented 6 years ago
-- Found libnvrtc: /lustre/app/acc/cuda/8.0.44/lib64/libnvrtc.so
CMake Warning at cmake/Dependencies.cmake:391 (message):
  mobile opengl is only used in android or ios builds.
Call Stack (most recent call first):
  CMakeLists.txt:74 (include)

-- Performing Test CAFFE2_LONG_IS_INT32_OR_64
-- Performing Test CAFFE2_LONG_IS_INT32_OR_64 - Success
-- Does not need to define long separately.
-- Performing Test CAFFE2_NEED_TO_TURN_OFF_DEPRECATION_WARNING
-- Performing Test CAFFE2_NEED_TO_TURN_OFF_DEPRECATION_WARNING - Success
-- Performing Test CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS
-- Performing Test CAFFE2_COMPILER_SUPPORTS_AVX2_EXTENSIONS - Success
-- Current compiler supports avx2 extention. Will build perfkernels.
-- GCC 4.8.5: Adding gcc and gcc_s libs to link line
-- Include NCCL operators
CMake Error at caffe2/contrib/CMakeLists.txt:6 (add_subdirectory):
  The source directory

    /lustre/gi75/i75012/dl/caffe2/caffe2/contrib/opengl

  does not contain a CMakeLists.txt file.

-- Including image processing operators
-- Excluding video processing operators due to no opencv
-- Excluding mkl operators as we are not using mkl
-- Automatically generating missing __init__.py files.
-- 
-- ******** Summary ********
-- General:
--   Git version           : 
--   System                : Linux
--   C++ compiler          : /usr/bin/c++
--   C++ compiler version  : 4.8.5
--   Protobuf compiler     : /lustre/gi75/i75012/env/local/protobuf-3.3.0/bin/protoc
--   CXX flags             :  -fopenmp -std=c++11 -O2 -fPIC -Wno-narrowing
--   Build type            : Release
--   Compile definitions   : CAFFE2_USE_EIGEN_FOR_BLAS;CAFFE2_USE_GOOGLE_GLOG;EIGEN_MPL2_ONLY;CAFFE2_PERF_WITH_AVX;CAFFE2_PERF_WITH_AVX2
-- 
--   BUILD_SHARED_LIBS     : ON
--   BUILD_PYTHON          : ON
--     Python version      : 2.7.5
--     Python library      : /usr/lib64/libpython2.7.so
--   BUILD_TEST            : ON
--   USE_CUDA              : ON
--     CUDA version        : 8.0
--   USE_CNMEM             : OFF
--   USE_NERVANA_GPU       : OFF
--   USE_GLOG              : ON
--   USE_GFLAGS            : OFF
--   USE_LMDB              : ON
--     LMDB version        : 0.9.18
--   USE_LEVELDB           : OFF
--   USE_OPENCV            : ON
--     OpenCV version      : 2.4.13
--   USE_FFMPEG            : 
--   USE_ZMQ               : OFF
--   USE_ROCKSDB           : OFF
--   USE_MPI               : ON
--   USE_NCCL              : ON
--   USE_NNPACK            : OFF
--   USE_OPENMP            : ON
--   USE_REDIS             : ON
--   USE_GLOO              : ON
-- Configuring incomplete, errors occurred!
See also "/lustre/gi75/i75012/dl/caffe2/build/CMakeFiles/CMakeOutput.log".
See also "/lustre/gi75/i75012/dl/caffe2/build/CMakeFiles/CMakeError.log".
make: *** No rule to make target `all'.  Stop.
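The configure step fails because add_subdirectory points at a directory that exists but has no CMakeLists.txt, and make then has no Makefile to work with. A small sketch for spotting such directories in a checkout; the function name list_missing_cmakelists is hypothetical, and the output is only a list of candidates, since not every subdirectory is necessarily referenced by add_subdirectory.

```shell
# Hypothetical helper: print each immediate subdirectory of DIR
# that does not contain a CMakeLists.txt file.
list_missing_cmakelists() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue          # glob matched nothing
        [ -f "${d}CMakeLists.txt" ] || echo "${d%/}"
    done
}

# Run from the top of the source tree.
list_missing_cmakelists caffe2/contrib
```

If caffe2/contrib/opengl shows up in the output, the checkout itself is incomplete at that revision, which is independent of the cmake options used.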

Hiroki11x commented 6 years ago

This depends on the Caffe2 version update. I'll try building rioyokotalab/caffe2 instead, since it is a stable version.

/lustre/gi75/i75012/dl/caffe2/caffe2/operators/resize_op.cu(63): error: identifier "__ldg" is undefined

1 error detected in the compilation of "/tmp/tmpxft_0000430d_00000000-20_resize_op.compute_20.cpp1.ii".
CMake Error at Caffe2_GPU_generated_resize_op.cu.o.cmake:260 (message):
  Error generating file
  /lustre/gi75/i75012/dl/caffe2/build/caffe2/CMakeFiles/Caffe2_GPU.dir/operators/./Caffe2_GPU_generated_resize_op.cu.o
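The nvcc temporary file name in the error above contains "compute_20", which suggests the file was being compiled for compute capability 2.0. The __ldg intrinsic is only available on compute capability 3.5 and higher, so the identifier is undefined for that target. A minimal sketch that extracts the architecture from such a file name and checks it against 3.5; the function name arch_from_nvcc_tmp is hypothetical.

```shell
# Hypothetical helper: print NN from a name containing ".compute_NN.".
# Prints nothing if the pattern is absent.
arch_from_nvcc_tmp() {
    printf '%s\n' "$1" | sed -n 's/.*\.compute_\([0-9][0-9]*\)\..*/\1/p'
}

arch=$(arch_from_nvcc_tmp "/tmp/tmpxft_0000430d_00000000-20_resize_op.compute_20.cpp1.ii")

# __ldg requires compute capability 3.5 or higher (compute_35 and up).
if [ -n "$arch" ] && [ "$arch" -lt 35 ]; then
    echo "compute_$arch is below 3.5: __ldg is unavailable for this target"
fi
```

How the list of target architectures is chosen depends on the Caffe2 revision and is not shown in this thread, so the sketch only diagnoses the failing target rather than changing it.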

The previous problem occurred again.

https://github.com/rioyokotalab/caffe2/issues/9

rioyokotalab/caffe2 is not a stable version.

Hiroki11x commented 6 years ago

https://github.com/rioyokotalab/caffe2/commit/3a2e09674920fa9ac124a4facd6ef90e4eea1b47 is a stable version for distributed training.