facebookresearch / DensePose

A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
http://densepose.org

Installation problem #150

Open rafikg opened 5 years ago

rafikg commented 5 years ago

I am running this code with Anaconda 3 on Ubuntu 18.04.

I started by creating and activating a conda environment named densepose, and installed Caffe2 in that environment. All tests pass up to make ops. I added these lines to the CMakeLists.txt file:

set(Caffe2_DIR "/home/rafikg/./anaconda3/envs/densepose/lib/python2.7/site-packages/torch/share/cmake/Caffe2/")
set(CUDA_TOOLKIT_ROOT_DIR "/home/rafikg/anaconda3/envs/densepose/include")
set(CUDNN_ROOT_DIR "/home/rafikg/anaconda3/envs/densepose/lib")
set(CUDA_HOST_COMPILER "/usr/bin/gcc-5")

mkdir -p build && cd build && cmake .. && make -j12
-- Caffe2: CUDA detected: 9.1
-- Caffe2: CUDA nvcc is: /usr/bin/nvcc
-- Caffe2: CUDA toolkit directory: /home/rafikg/anaconda3/envs/densepose/include
-- Caffe2: Header version is: 9.1
-- Found cuDNN: v7.1.3 (include: /home/rafikg/anaconda3/envs/densepose/include, library: /home/rafikg/anaconda3/envs/densepose/lib/libcudnn.so)
-- Autodetected CUDA architecture(s): 6.1
-- Added CUDA NVCC flags for: -gencode;arch=compute_61,code=sm_61
-- Summary:
-- CMake version : 3.12.2
-- CMake command : /home/rafikg/anaconda3/envs/densepose/bin/cmake
-- System name : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 7.3.0
-- CXX flags : -std=c++11 -O2 -fPIC -Wno-narrowing
-- Caffe2 version : 1.0.0
-- Caffe2 include path : /home/rafikg/anaconda3/envs/densepose/lib/python2.7/site-packages/torch/lib/include
-- Caffe2 found CUDA : True
-- CUDA version : 9.1
-- CuDNN version : 7.1.3
-- Configuring done
-- Generating done
-- Build files have been written to: /home/rafikg/Documents/Bitbucket/DensePose/build
make[1]: Entering directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
make[2]: Entering directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
make[3]: Entering directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
make[3]: Entering directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
[ 12%] Building NVCC (Device) object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o
Scanning dependencies of target caffe2_detectron_custom_ops
[ 25%] Building NVCC (Device) object CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o
make[3]: Leaving directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
make[3]: Entering directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
[ 37%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o
[ 50%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/pool_points_interp.cc.o
In file included from /home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/zero_even_op.cc:9:0:
/home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/zero_even_op.h:12:10: fatal error: caffe2/core/context.h: No such file or directory
 #include "caffe2/core/context.h"
          ^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
In file included from /home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/pool_points_interp.cc:10:0:
/home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/pool_points_interp.h:13:10: fatal error: caffe2/core/context.h: No such file or directory
 #include "caffe2/core/context.h"
          ^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
CMakeFiles/caffe2_detectron_custom_ops.dir/build.make:75: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o' failed
make[3]: [CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o] Error 1
make[3]: Waiting for unfinished jobs....
CMakeFiles/caffe2_detectron_custom_ops.dir/build.make:62: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/pool_points_interp.cc.o' failed
make[3]: [CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/pool_points_interp.cc.o] Error 1
make[3]: Leaving directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops.dir/all' failed
make[2]: [CMakeFiles/caffe2_detectron_custom_ops.dir/all] Error 2
make[2]: *** Waiting for unfinished jobs....
/home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/zero_even_op.cu:9:37: fatal error: caffe2/core/context_gpu.h: No such file or directory
compilation terminated.
/home/rafikg/Documents/Bitbucket/DensePose/detectron/ops/pool_points_interp.cu:11:37: fatal error: caffe2/core/context_gpu.h: No such file or directory
compilation terminated.
CMake Error at caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o.cmake:219 (message): Error generating /home/rafikg/Documents/Bitbucket/DensePose/build/CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/./caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o
CMake Error at caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o.cmake:219 (message): Error generating /home/rafikg/Documents/Bitbucket/DensePose/build/CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/./caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o
CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make:70: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o' failed
make[3]: [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o] Error 1
make[3]: Waiting for unfinished jobs....
CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make:63: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o' failed
make[3]: [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_pool_points_interp.cu.o] Error 1
make[3]: Leaving directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
CMakeFiles/Makefile2:109: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all' failed
make[2]: [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all] Error 2
make[2]: Leaving directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
Makefile:129: recipe for target 'all' failed
make[1]: [all] Error 2
make[1]: Leaving directory '/home/rafikg/Documents/Bitbucket/DensePose/build'
Makefile:13: recipe for target 'ops' failed
make: [ops] Error 2

linkinpark213 commented 5 years ago

I think I ran into this problem earlier. In my case, context.h wasn't in caffe2/core/ but in another directory (I wonder why, too). I solved it by adding this line to CMakeLists.txt:

include_directories("/path/to/pytorch/torch/lib/include")

If it doesn't help, try finding your own context.h in your Caffe2 path, and replace the path above with it.
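The search for "your own context.h" can be scripted. The following is a minimal sketch, not part of DensePose or PyTorch: the function name find_include_roots is hypothetical, and it only assumes that the header sits somewhere below the directory you pass in. It returns the directories that could be given to include_directories so that #include "caffe2/core/context.h" resolves.

```python
import os


def find_include_roots(search_root, header="caffe2/core/context.h"):
    """Return directories under search_root that contain `header`.

    Each returned directory is a candidate include path: appending the
    relative header path to it yields an existing file.
    """
    rel = os.path.normpath(header)
    filename = os.path.basename(rel)
    suffix = os.sep + rel
    roots = []
    for dirpath, _dirnames, filenames in os.walk(search_root):
        if filename in filenames:
            full = os.path.join(dirpath, filename)
            # Keep only hits whose trailing path matches the whole
            # relative header path, not just the file name.
            if full.endswith(suffix):
                roots.append(full[: -len(suffix)])
    return sorted(roots)
```

For example, find_include_roots("/path/to/pytorch") would list every candidate directory below that checkout; the path is a placeholder, as in the comment above.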

rafikg commented 5 years ago

@linkinpark213 Thanks. I fixed this error but now I get another one. I added these lines to the CMakeLists.txt file:

set(CMAKE_PREFIX_PATH "/home/rafikg/anaconda3/envs/facebook/lib/python2.7/site-packages/torch/share/cmake/Caffe2")
set(CUDNN_INCLUDE_DIR "/usr/local/cuda/include")
set(CUDNN_LIBRARY "/usr/local/cuda/lib64/libcudart.so")
include_directories("/home/rafikg/anaconda3/envs/facebook/lib/python2.7/site-packages/torch/lib/include" "/home/rafikg/anaconda3/envs/facebook/include")

The new error is:

/home/rafikg/anaconda3/envs/facebook/lib/python2.7/site-packages/torch/lib/include/caffe2/utils/cblas.h:8:10: fatal error: mkl_cblas.h: No such file or directory
 #include <mkl_cblas.h>

This code drives everyone crazy :(

linkinpark213 commented 5 years ago

I ran into this problem, too. Some variable named USE_MKL or similar was set to ON, but MKL wasn't installed at all. So I installed MKL and added its include path to CPATH. In my case it was:

export CPATH=$CPATH:/opt/intel/compilers_and_libraries_2019.1.144/linux/mkl/include

The exact path may vary according to the version of MKL and your configuration.
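One detail worth knowing about the export line: if CPATH is unset, $CPATH:/some/dir expands to a value with a leading colon, and GCC treats an empty CPATH element as the current working directory. The sketch below builds the value without empty or duplicate entries. The helper name append_to_cpath is hypothetical, and the ":" separator assumes a Linux toolchain.

```python
def append_to_cpath(current, new_dir):
    """Return the CPATH value `current` with `new_dir` appended.

    Empty entries (which GCC reads as the current directory) are dropped,
    and `new_dir` is not added a second time if it is already present.
    """
    entries = [entry for entry in current.split(":") if entry]
    if new_dir not in entries:
        entries.append(new_dir)
    return ":".join(entries)
```

A possible use is os.environ["CPATH"] = append_to_cpath(os.environ.get("CPATH", ""), mkl_include) before launching make from the same Python process, where mkl_include stands for your own MKL include directory.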

rafikg commented 5 years ago

Thanks for your reply. I have resolved almost all of the problems. I still get undefined symbol: _ZN6google8protobuf8internal10LogMessagelsEPKc, even after applying these instructions: http://linkinpark213.com/2018/11/18/densepose-minesweeping/#2-7-Undefined-symbol-ZN6google8protobuf8internal9ArenaImpl28AllocateAlignedAndAddCleanupEmPFvPvE and I get another error when running simple_infer: https://github.com/facebookresearch/DensePose/issues/33

linkinpark213 commented 5 years ago

Emmmm, you didn't forget about running make ops again, did you?

It was careless of me not to notice that you installed Caffe2 with Anaconda. My Caffe2 had a protobuf submodule pinned at an older commit #2761122b, pushed on November 14th, 2017. I searched the protobuf releases page and found that the corresponding version number should be v3.5.0. Therefore, if you set PROTOBUF_LIB to the protobuf.a that was installed with Anaconda, I guess you'll need to make that v3.5.0 as well.

As for the other issue, I have no idea yet. Maybe try the others' solutions?

rafikg commented 5 years ago

@linkinpark213 Sure, I ran make ops and installed protobuf=3.5.0, and I still get the same error.

linkinpark213 commented 5 years ago

@Gouiaa I suspect there are clashes among multiple protobuf and Detectron installations. Did you install other versions of protobuf with apt-get or pip?
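One way to check for several protobuf copies on disk is to scan the usual library locations. This is a rough sketch under stated assumptions: find_protobuf_libs is a hypothetical helper, the file name pattern libprotobuf* is a guess at how the libraries are named on a typical Linux install, and the directories to scan are up to you.

```python
import fnmatch
import os


def find_protobuf_libs(roots, pattern="libprotobuf*"):
    """Return all files below the given roots whose name matches `pattern`."""
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, pattern):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling it with, say, the conda environment's lib directory plus /usr/lib and /usr/local/lib (assumed locations) shows whether more than one installation could be picked up at link time.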

ckyleda commented 5 years ago

I have this exact same issue with only one version of protobuf (3.5.0) installed.

@linkinpark213

There are no other versions of the protobuf library installed.

What is going on here? There is no protobuf library packaged with the Anaconda install of PyTorch (which obviously there should be), and inspecting the caffe libraries with ldd reveals no dependency on libprotobuf at all!

------ EDIT -------

Okay, so it seems that PyTorch no longer depends on protobuf at all.

https://github.com/pytorch/pytorch/commit/997df9a6ec4e982b68bd952ccadc7fc453e2eaca

Additionally, it now seems that libprotobuf is built from version 3.5.0 and embedded inside the libcaffe library.

https://github.com/pytorch/pytorch/search?q=BUILD_CUSTOM_PROTOBUF&unscoped_q=BUILD_CUSTOM_PROTOBUF

So where is this mismatch coming from if we also build DensePose with protobuf 3.5.0?

linkinpark213 commented 5 years ago

Please calm down. As you mentioned:

"I have this EXACT same issue with only one version of protobuf (3.5.0) installed."

Did you mean OSError: .../build/libcaffe2_detectron_custom_ops_gpu.so: undefined symbol: _ZN6google8protobuf8internal9ArenaImpl28AllocateAlignedAndAddCleanupEmPFvPvE?

If so, @Gouiaa and I solved this issue by using PyTorch built from source. The source tree includes the Detectron module and the third-party dependencies (including protobuf, even now), while the Anaconda installation doesn't. Therefore, you may have to build PyTorch yourself.

I strongly agree that the developers should work on how to make the installation procedure less painful, though.

ckyleda commented 5 years ago

@linkinpark213

Did you ever encounter this issue while building Pytorch?

https://github.com/pytorch/pytorch/issues/15433#issuecomment-449408937

ckyleda commented 5 years ago

I have now built PyTorch from source and confirmed that caffe2 works. I referenced the libprotobuf built by PyTorch in the CMakeLists file and rebuilt (make ops), and I can confirm that this does not resolve the issue. The error is now:

libcaffe2_detectron_custom_ops_gpu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE

Is there any other way to move forward?
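The undefined symbols in this thread are mangled C++ names, and the namespace at the front hints at which library is mismatched. The sketch below reads only the leading nested name of a symbol that starts with _ZN. It is not a full demangler (a tool such as c++filt handles the general case), and the function name nested_name_parts is hypothetical.

```python
def nested_name_parts(symbol):
    """Return the leading length-prefixed name components of a mangled symbol.

    Only plain nested names of the form _ZN<len><name><len><name>... are
    handled; qualifiers, templates and operators end the scan.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("not a plain nested mangled name: %r" % symbol)
    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    return parts
```

Applied to the symbols above, the protobuf errors resolve to names under google::protobuf::internal, while the latest one resolves to at::UndefinedTensorImpl::_singleton, so it belongs to a different library than protobuf.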

rafikg commented 5 years ago

@ckyleda This code drove me crazy for a while. I eventually got it working, but I did not get the frame rate mentioned in the paper. I opened an issue on GitHub and tried to find someone from Facebook at the last NeurIPS conference who could explain this, but got no answer. I (@Gouiaa), @linkinpark213, and someone from Intel (whom I met at NeurIPS 2018) ran this code on different devices, all better than the GPU mentioned in the paper, and none of us got the 20-25 fps reported in the paper! So, what's next?

ckyleda commented 5 years ago

@Gouiaa @linkinpark213

I have now resolved this. For future reference:

If you followed the instructions to get the Anaconda version of PyTorch working with DensePose and then switched to the version built from source, you must remove the additional line in the CMakeLists file that added the source files to resolve the 'ThreadPool.h missing' error, as shown in:

https://linkinpark213.com/2018/11/18/densepose-minesweeping/#2-8-fatal-error-caffe2-utils-threadpool-ThreadPool-h-No-such-file-or-directory

This was a ridiculous journey.

Thanks to @linkinpark213 for his comprehensive blog and github posts.

@Gouiaa I agree that this kind of installation process is insane. Nearly nothing works out of the box and the standard supplied guide is essentially useless.

Unfortunately I'm not sure how to help you achieve a better framerate - is your GPU definitely being utilised?
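When comparing against a published frame rate, it helps to measure it the same way every time. The following is a generic timing sketch, not DensePose code: measure_fps and its parameters are hypothetical, and process_frame is assumed to return only once the result for that frame is actually available.

```python
import time


def measure_fps(process_frame, frames, warmup=2):
    """Return frames per second for process_frame over `frames`.

    The first `warmup` frames are run but not timed, so one-off setup
    costs do not distort the result.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("need more frames than warmup iterations")
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very coarse clock.
    return len(timed) / elapsed if elapsed > 0 else float("inf")
```

Watching GPU utilisation with a tool such as nvidia-smi while this runs would show whether the GPU is in use at all.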