facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0

Caffe2 include path is wrong in `make ops`. Now what? #708

Open zakdances opened 5 years ago

zakdances commented 5 years ago

Expected results

What did you expect to see?

`make ops` runs without errors.

Actual results

`make ops` failed with the error "caffe2/core/context.h: No such file or directory".

What did you observe instead?

mkdir -p build && cd build && cmake .. && make -j2
-- Caffe2: CUDA detected: 9.0
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 9.0
-- Found cuDNN: v7.3.1  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Autodetected CUDA architecture(s):  3.7
-- Added CUDA NVCC flags for: -gencode;arch=compute_37,code=sm_37
-- Summary:
--   CMake version        : 3.12.2
--   CMake command        : /opt/conda/bin/cmake
--   System name          : Linux
--   C++ compiler         : /usr/bin/c++
--   C++ compiler version : 5.4.0
--   CXX flags            :  -std=c++11 -O2 -fPIC -Wno-narrowing
--   Caffe2 version       : 1.0.0
--   Caffe2 include path  : /opt/conda/envs/py3/lib/python3.7/site-packages/torch/include
--   Caffe2 found CUDA    : True
--     CUDA version       : 9.0
--     CuDNN version      : 7.3.1
-- Configuring done
-- Generating done
-- Build files have been written to: /app/detectron/build

In file included from /app/detectron/detectron/ops/zero_even_op.cc:17:0:
/app/detectron/detectron/ops/zero_even_op.h:20:33: fatal error: caffe2/core/context.h: No such file or directory
compilation terminated.
CMakeFiles/caffe2_detectron_custom_ops.dir/build.make:62: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o' failed
make[3]: *** [CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o] Error 1
make[3]: Leaving directory '/app/detectron/build'
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops.dir/all' failed
make[2]: *** [CMakeFiles/caffe2_detectron_custom_ops.dir/all] Error 2
make[2]: *** Waiting for unfinished jobs....
/app/detectron/detectron/ops/zero_even_op.cu:17:37: fatal error: caffe2/core/context_gpu.h: No such file or directory
compilation terminated.
CMake Error at caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o.cmake:219 (message):
  Error generating
  /app/detectron/build/CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/./caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o

CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/build.make:63: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o' failed
make[3]: *** [CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/detectron/ops/caffe2_detectron_custom_ops_gpu_generated_zero_even_op.cu.o] Error 1
make[3]: Leaving directory '/app/detectron/build'
CMakeFiles/Makefile2:109: recipe for target 'CMakeFiles/caffe2_detectron_custom_ops_gpu.dir/all' failed
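
A quick way to test whether the reported path is the problem is to read the `Caffe2 include path` line out of the CMake summary and check whether `caffe2/core/context.h` exists beneath it. The sketch below is a minimal diagnostic, not part of Detectron. The function names `reported_include_path` and `header_is_reachable` are hypothetical, and the regular expression assumes the summary format shown in the log above.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the summary line printed during configuration, e.g.
# "--   Caffe2 include path  : /some/dir"
# (format assumed from the log in this thread).
_INCLUDE_LINE = re.compile(
    r"^--[ \t]+Caffe2 include path[ \t]*:[ \t]*(\S.*?)[ \t]*$",
    re.MULTILINE,
)


def reported_include_path(cmake_output: str) -> Optional[Path]:
    """Return the 'Caffe2 include path' from the CMake summary, or None if absent."""
    match = _INCLUDE_LINE.search(cmake_output)
    return Path(match.group(1)) if match else None


def header_is_reachable(include_dir: Path,
                        header: str = "caffe2/core/context.h") -> bool:
    """True if `header` exists relative to `include_dir`."""
    return (include_dir / header).is_file()


if __name__ == "__main__":
    # Two lines copied from the summary in the report above.
    sample = (
        "--   Caffe2 version       : 1.0.0\n"
        "--   Caffe2 include path  : "
        "/opt/conda/envs/py3/lib/python3.7/site-packages/torch/include\n"
    )
    path = reported_include_path(sample)
    if path is None:
        print("No 'Caffe2 include path' line found.")
    else:
        status = "found" if header_is_reachable(path) else "missing"
        print(f"{path}: caffe2/core/context.h {status}")
```

If the header is reported missing, the path CMake picked up does not contain the Caffe2 headers, which would explain the compiler error even though configuration succeeded.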

Detailed steps to reproduce

E.g.:

make ops

System information

ir413 commented 5 years ago

Thanks for reporting this @zakdances. I reproduced an issue when building the custom ops with Caffe2 from the latest pytorch-nightly Anaconda package. However, it is a different one. Which CMake version are you using? Could you also clarify whether by "official anaconda build" you mean pytorch-nightly?

zakdances commented 5 years ago

@ir413 Thanks for looking into this. I made an annotated Dockerfile that demonstrates this bug. Make sure Docker is running, then run:

git clone https://github.com/WhoDATinc/detectron-crash-test.git && cd detectron-crash-test && docker build .

If you want to run `make ops` yourself from a bash shell, you can instead run `docker build . --build-arg make_ops=false` to skip the command that triggers the crash.

As you can see in the Dockerfile, I'm updating cmake to the latest version (3.12) and using `conda install pytorch-nightly -c pytorch`, which is the official install command for Caffe2. What error message is causing your crash?

htzheng commented 5 years ago

I had the same issue when running `make ops` and couldn't find a way around it:

[ 40%] Building CXX object CMakeFiles/caffe2_detectron_custom_ops.dir/detectron/ops/zero_even_op.cc.o
In file included from /fileserver/haitian/Cloth/detectron/detectron/ops/zero_even_op.cc:17:0:
/fileserver/haitian/Cloth/detectron/detectron/ops/zero_even_op.h:20:33: fatal error: caffe2/core/context.h: No such file or directory
 #include "caffe2/core/context.h"
                                 ^
compilation terminated.

jgbos commented 5 years ago

I get a similar error: the compiler cannot find the include files even though the Caffe2 include path is set correctly. I'm using today's nightly pytorch build and running `make ops`. The nightly build includes libcaffe2_detectron_ops_gpu.so in

/home/justin/.conda/envs/pytorch-1.0/lib/python3.7/site-packages/torch/lib

-- Summary:
--   CMake version        : 3.13.0-rc3
--   CMake command        : /home/justin/.usr/local/cmake-3.13.0-rc3-Linux-x86_64/bin/cmake
--   System name          : Linux
--   C++ compiler         : /home/justin/.usr/local/gcc/bin/g++
--   C++ compiler version : 5.2.0
--   CXX flags            :  -std=c++11 -O2 -fPIC -Wno-narrowing
--   Caffe2 version       : 1.0.0
--   Caffe2 include path  : /home/justin/.conda/envs/pytorch-1.0/lib/python3.7/site-packages/torch/lib/include
--   Caffe2 found CUDA    : True
--     CUDA version       : 9.0
--     CuDNN version      : 7.2.1
-- Configuring done
-- Generating done

zakdances commented 5 years ago

@ir413 Any movement on this, by any chance?

bearpaw commented 5 years ago

I hit the same problem with the latest pytorch-nightly: pytorch-nightly-1.0.0.dev20190116-py2.7_cuda10.0.130_cudnn7.4.1_0

The Caffe2 include path is wrong:

--   Caffe2 include path  : /home/wei/anaconda2/pkgs/pytorch-nightly-1.0.0.dev20190116-py2.7_cuda10.0.130_cudnn7.4.1_0/lib/python2.7/site-packages/torch/include

while the correct path should be

/home/wei/anaconda2/pkgs/pytorch-nightly-1.0.0.dev20190116-py2.7_cuda10.0.130_cudnn7.4.1_0/lib/python2.7/site-packages/torch/lib/include

Any suggestions?
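
To see which of the two locations actually holds the headers in a given environment, one option is to probe both under the installed torch package. This is a minimal sketch under stated assumptions: `CANDIDATE_SUBDIRS` and `find_caffe2_include_dirs` are hypothetical names, the two sub-directories are simply the ones mentioned in this thread, and other builds may lay the package out differently.

```python
import importlib.util
from pathlib import Path
from typing import List

# Sub-directories of the torch package that this thread mentions as possible
# locations of the Caffe2 headers. This list is an assumption, not exhaustive.
CANDIDATE_SUBDIRS = ("include", "lib/include")


def find_caffe2_include_dirs(torch_dir: Path,
                             header: str = "caffe2/core/context.h") -> List[Path]:
    """Return every candidate directory under `torch_dir` that contains `header`."""
    return [
        torch_dir / sub
        for sub in CANDIDATE_SUBDIRS
        if (torch_dir / sub / header).is_file()
    ]


if __name__ == "__main__":
    # Locate the installed torch package without importing it.
    spec = importlib.util.find_spec("torch")
    if spec is None or not spec.submodule_search_locations:
        print("torch is not installed in this environment.")
    else:
        torch_dir = Path(list(spec.submodule_search_locations)[0])
        hits = find_caffe2_include_dirs(torch_dir)
        if hits:
            for hit in hits:
                print(f"caffe2/core/context.h found under: {hit}")
        else:
            print(f"caffe2/core/context.h not found under {torch_dir}")
```

Comparing the directory this prints with the `Caffe2 include path` line in the CMake summary shows whether the configured path and the real header location disagree.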