facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0

Question about `test_throws_on_non_1D_arrays` in `test_zero_even_op.py`? #754

Open yuchen-xue opened 6 years ago

yuchen-xue commented 6 years ago

Expected results

To fully pass the test_zero_even_op.py test.

Actual results

  1. First, I ran test_zero_even_op.py as usual, but it failed with the output below:
    
    <ENV_NAME>$ python detectron/tests/test_zero_even_op.py
    [E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may notget the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may notget the full speed of your CPU.
    .....detectron/tests/test_zero_even_op.py:87: DeprecationWarning: Please use assertRaisesRegex instead.
    with self.assertRaisesRegexp(RuntimeError, 'X\.ndim\(\) == 1'):
    ......F
    ======================================================================
    FAIL: test_throws_on_non_1D_arrays (__main__.ZeroEvenOpTest)
    ----------------------------------------------------------------------
    RuntimeError: [enforce fail at zero_even_op.cc:25] X.dim() == 1.
    Error from operator:
    input: "X" output: "Y" name: "" type: "ZeroEven"frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x76 (0x7f2c21796e06 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/../../torch/lib/libc10.so)
    frame #1: caffe2::ZeroEvenOp<float, caffe2::CPUContext>::RunOnDevice() + 0x17a (0x7f2be4aec2aa in /home/ee303/WORKSPACE/POSE/DetectAndTrack/Detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #2: caffe2::Operator<caffe2::CPUContext>::Run(int) + 0x5b (0x7f2be4af6f8b in /home/ee303/WORKSPACE/POSE/DetectAndTrack/Detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #3: caffe2::Workspace::RunOperatorOnce(caffe2::OperatorDef const&) + 0x4a (0x7f2c335ad68a in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
    frame #4: <unknown function> + 0x5bb58 (0x7f2c34919b58 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)
    frame #5: <unknown function> + 0x5bd04 (0x7f2c34919d04 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)
    frame #6: <unknown function> + 0x95970 (0x7f2c34953970 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)
    <omitting python frames>

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "detectron/tests/test_zero_even_op.py", line 51, in test_throws_on_non_1D_arrays
        self._run_zero_even_op(X)
    AssertionError: "X.ndim() == 1" does not match "[enforce fail at zero_even_op.cc:25] X.dim() == 1.
    Error from operator:
    input: "X" output: "Y" name: "" type: "ZeroEven"frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x76 (0x7f2c21796e06 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/../../torch/lib/libc10.so)
    frame #1: caffe2::ZeroEvenOp<float, caffe2::CPUContext>::RunOnDevice() + 0x17a (0x7f2be4aec2aa in /home/ee303/WORKSPACE/POSE/DetectAndTrack/Detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #2: caffe2::Operator<caffe2::CPUContext>::Run(int) + 0x5b (0x7f2be4af6f8b in /home/ee303/WORKSPACE/POSE/DetectAndTrack/Detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #3: caffe2::Workspace::RunOperatorOnce(caffe2::OperatorDef const&) + 0x4a (0x7f2c335ad68a in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
    frame #4: <unknown function> + 0x5bb58 (0x7f2c34919b58 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)
    frame #5: <unknown function> + 0x5bd04 (0x7f2c34919d04 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)
    frame #6: <unknown function> + 0x95970 (0x7f2c34953970 in /home/ee303/miniconda3/envs/dat-src-py37-new-pytorch/lib/python3.7/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-37m-x86_64-linux-gnu.so)

    "
    ----------------------------------------------------------------------
    Ran 12 tests in 1.022s

    FAILED (failures=1)

  2. Then I commented out the `test_throws_on_non_1D_arrays` function in this test file (https://github.com/facebookresearch/Detectron/blob/8181a324796202e4afe7660b7458b7bf1e08cf8b/detectron/tests/test_zero_even_op.py#L48-L51). The test passed! The result is:

    $ python detectron/tests/test_zero_even_op.py
    [E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may notget the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    [E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may notget the full speed of your CPU.
    .....detectron/tests/test_zero_even_op.py:87: DeprecationWarning: Please use assertRaisesRegex instead.
    with self.assertRaisesRegexp(RuntimeError, 'X\.ndim\(\) == 1'):
    ......
    ----------------------------------------------------------------------
    Ran 11 tests in 1.070s

    OK

Detailed steps to reproduce

    # spawn a new `conda` env
    $ conda create -n python
    $ source activate
    # install `caffe2`
    $ cd pytorch
    $ python setup.py install
    # install `Detectron`
    $ cd ../Detectron
    $ make && make ops

Then run the test as shown in the Actual results part.

System information

  * Operating system: Ubuntu 16.04
  * Compiler version: gcc 5.4.0
  * CUDA version: 9.0
  * cuDNN version: 7.0.5
  * NVIDIA driver version: 384.130
  * GPU models (for all devices if they are not all the same): GTX1070
  * `PYTHONPATH` environment variable: __
  * `python --version` output: 3.7.1

So my question

Why does the test file pass only after I remove the `test_throws_on_non_1D_arrays` function?
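One thing the log above does show: the test asks for a message matching `X\.ndim\(\) == 1`, while the operator reports `X.dim() == 1`. The snippet below is a minimal sketch of that mismatch using only Python's `re` module. The relaxed pattern `X\.n?dim\(\) == 1` is an assumption made for illustration, not a fix taken from the Detectron repository.

```python
import re

# Message text copied from the enforce failure in the log above.
actual_message = "[enforce fail at zero_even_op.cc:25] X.dim() == 1."

# Pattern the test hands to assertRaisesRegexp (see the DeprecationWarning line).
expected_pattern = r"X\.ndim\(\) == 1"

# Hypothetical relaxed pattern accepting both spellings (an assumption).
relaxed_pattern = r"X\.n?dim\(\) == 1"

# True when the pattern occurs anywhere in the message, as assertRaisesRegex checks.
strict_match = re.search(expected_pattern, actual_message) is not None
relaxed_match = re.search(relaxed_pattern, actual_message) is not None

print("strict pattern matches:", strict_match)
print("relaxed pattern matches:", relaxed_match)
```

If this reading is right, the operator still rejects non-1D input as intended, and only the wording of the expected message differs from what the installed Caffe2 build emits.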
shellhue commented 5 years ago

I hit the same problem. After commenting out `test_throws_on_non_1D_arrays`, everything passes!
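Rather than commenting the test out, it could be marked as skipped so that the gap stays visible in the test report. Below is a minimal sketch using the standard `unittest.skip` decorator. The class name `ZeroEvenOpTestSketch`, the placeholder test, and the skip reason are stand-ins invented for illustration, not the actual Detectron test code.

```python
import unittest


class ZeroEvenOpTestSketch(unittest.TestCase):
    """Stand-in test case; the real tests live in test_zero_even_op.py."""

    # The skip reason is an assumption based on the messages in the log.
    @unittest.skip("expected message says X.ndim(), operator reports X.dim()")
    def test_throws_on_non_1D_arrays(self):
        self.fail("not executed while the skip decorator is in place")

    def test_placeholder(self):
        # Hypothetical stand-in for the other, passing tests.
        self.assertEqual(1 + 1, 2)


# Load and run the sketch programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ZeroEvenOpTestSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("skipped tests:", len(result.skipped))
```

The runner then reports the test as skipped along with the reason, instead of silently running one test fewer.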

marcobevih2o commented 5 years ago

I had the same problem. Any news? My output:

    Traceback (most recent call last):
      File "/detectron/detectron/tests/test_zero_even_op.py", line 51, in test_throws_on_non_1D_arrays
        self._run_zero_even_op(X)
    AssertionError: "X.ndim() == 1" does not match "[enforce fail at zero_even_op.cc:25] X.dim() == 1.
    Error from operator:
    input: "X" output: "Y" name: "" type: "ZeroEven"frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x68 (0x7f7ae88854f8 in /pytorch/build/lib/libc10.so)
    frame #1: caffe2::ZeroEvenOp<float, caffe2::CPUContext>::RunOnDevice() + 0x22b (0x7f7a8b6771ab in /detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #2: caffe2::Operator<caffe2::CPUContext>::Run(int) + 0x92 (0x7f7a8b680912 in /detectron/build/libcaffe2_detectron_custom_ops_gpu.so)
    frame #3: caffe2::Workspace::RunOperatorOnce(caffe2::OperatorDef const&) + 0x42 (0x7f7ad31d0022 in /pytorch/build/lib/libcaffe2.so)
    frame #4: <unknown function> + 0x5a378 (0x7f7ae8f77378 in /pytorch/build/caffe2/pytho/caffe2_pybind11_state_gpu.so)
    frame #5: <unknown function> + 0x5a524 (0x7f7ae8f77524 in /pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)
    frame #6: <unknown function> + 0x93990 (0x7f7ae8fb0990 in /pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so)

    frame #15: python() [0x4d57a3]
    frame #19: python() [0x4d5669]
    frame #20: python() [0x4eef5e]
    frame #22: python() [0x548fc3]
    frame #25: python() [0x4d57a3]
    frame #29: python() [0x4d5669]
    frame #30: python() [0x4eef5e]
    frame #32: python() [0x548fc3]
    frame #35: python() [0x4d57a3]
    frame #39: python() [0x4d5669]
    frame #40: python() [0x4eef5e]
    frame #42: python() [0x548fc3]
    frame #47: python() [0x4d5669]
    frame #48: python() [0x4eef5e]
    frame #49: python() [0x4eeb66]
    frame #50: python() [0x4aaafb]
    frame #53: python() [0x4eb69f]
    frame #57: __libc_start_main + 0xf0 (0x7f7aec66f830 in /lib/x86_64-linux-gnu/libc.so.6)

System information

  * Operating system: Ubuntu 16.04
  * Compiler version: gcc 5.4.0
  * CUDA version: 9.0
  * cuDNN version: 7.4
  * NVIDIA driver version: 390.0
  * GPU models (for all devices if they are not all the same): K2200
  * PYTHONPATH environment variable:
  * python --version output: 2.7.1

yzhq97 commented 5 years ago

Here I quote this awesome blog post by Harper Long, "[MineSweeping] The Long Struggle of DensePose Installation":

Cause

As can be seen from the messy undefined symbol, this should have something to do with Caffe2 and probably CXX11 (oh really???).

Run `ldd -r /path/to/densepose/build/libcaffe2_detectron_custom_ops.so` and one or more undefined symbols with similar names will be shown; these should have been defined in `libcaffe2.so`. Running `strings -a /path/to/pytorch/torch/lib/libcaffe2.so | grep _ZN6caffe219CPUOperator` brings up a few similar symbols (two, in my case), but they differ from the undefined one: "B5cxx11" is missing.

Why does DensePose want to find a symbol with "B5cxx11"? Who added this suffix? It should be our GCC that did it when compiling DensePose with the C++11 standard!

In short, your GCC version is wrong. Run `strings -a /path/to/pytorch/torch/lib/libcaffe2.so | grep GCC:` to see which version your Caffe2 was built with:

    GCC: (GNU) 4.9.2 20150212 (Red Hat 4.9.2-6)

You should either install the specific version of GCC, or compile Caffe2 yourself. Check this post for how to install GCC-4.9.2.
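For machines without `strings`, the version check quoted above can be approximated in Python. The sketch below scans a binary for printable runs that start with `GCC: `, which is roughly what `strings -a ... | grep GCC:` does. The helper names `gcc_idents` and `gcc_idents_in_file` are made up for this example, and it assumes the compiler identification is stored as plain ASCII text in the file.

```python
import re
from typing import List


def gcc_idents(data: bytes) -> List[str]:
    """Return the distinct printable 'GCC: ...' strings found in a byte blob."""
    # A run of printable ASCII that starts with "GCC: " and ends at the
    # first non-printable byte (typically the NUL terminator).
    matches = re.findall(rb"GCC: [\x20-\x7e]+", data)
    return sorted({m.decode("ascii") for m in matches})


def gcc_idents_in_file(path: str) -> List[str]:
    """Read a shared library from disk and list its GCC identification strings."""
    with open(path, "rb") as handle:
        return gcc_idents(handle.read())


# Synthetic bytes wrapped around the version string quoted in the post above.
sample = b"\x7fELF\x00\x01\x02GCC: (GNU) 4.9.2 20150212 (Red Hat 4.9.2-6)\x00.shstrtab\x00"
for ident in gcc_idents(sample):
    print(ident)
```

On a real installation one would call `gcc_idents_in_file` with the path to `libcaffe2.so` and compare the result with the output of `gcc --version`.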