princeton-vl / CornerNet-Lite

BSD 3-Clause "New" or "Revised" License

_cpools error #138

Open Hello526 opened 4 years ago

Hello526 commented 4 years ago

Hi, test.py runs without any errors, but when I run train.py I get the following error:

Traceback (most recent call last):
  File "/mnt/pycharm/Duan/train.py", line 203, in <module>
    train(training_dbs, validation_db, args.start_iter)
  File "/mnt/pycharm/Duan/train.py", line 138, in train
    training_loss, focal_loss, pull_loss, push_loss, regr_loss = nnet.train(*training)
  File "/mnt/pycharm/Duan/nnet/py_factory.py", line 93, in train
    loss.backward()
  File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/autograd/function.py", line 77, in apply
    return self._forward_cls.backward(self, *args)
  File "/mnt/pycharm/Duan/models/py_utils/_cpools/__init__.py", line 57, in backward
    output = right_pool.backward(input, grad_output.to(torch.bool))[0]
RuntimeError: The output tensor of lt must be a bool, but was Byte (comparison_op_out at /pytorch/aten/src/ATen/native/BinaryOps.cpp:235)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f519aec5813 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1950cf1 (0x7f519cc68cf1 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x41bc418 (0x7f519f4d4418 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x3b6ee7e (0x7f519ee86e7e in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: pool_backward(at::Tensor, at::Tensor) + 0x595 (0x7f5191d86135 in /root/.local/lib/python3.7/site-packages/cpools-0.0.0-py3.7-linux-x86_64.egg/right_pool.cpython-37m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x1c651 (0x7f5191d97651 in /root/.local/lib/python3.7/site-packages/cpools-0.0.0-py3.7-linux-x86_64.egg/right_pool.cpython-37m-x86_64-linux-gnu.so)
frame #6: <unknown function> + 0x1d52a (0x7f5191d9852a in /root/.local/lib/python3.7/site-packages/cpools-0.0.0-py3.7-linux-x86_64.egg/right_pool.cpython-37m-x86_64-linux-gnu.so)
frame #7: _PyMethodDef_RawFastCallKeywords + 0x264 (0x55c8594ea114 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #8: _PyCFunction_FastCallKeywords + 0x21 (0x55c8594ea231 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #9: _PyEval_EvalFrameDefault + 0x4e9d (0x55c85954ea5d in /root/miniconda3/envs/myconda/bin/python3.7)
frame #10: _PyFunction_FastCallDict + 0x10b (0x55c8594a473b in /root/miniconda3/envs/myconda/bin/python3.7)
frame #11: _PyEval_EvalFrameDefault + 0x1e35 (0x55c85954b9f5 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #12: _PyEval_EvalCodeWithName + 0x2f9 (0x55c8594a36f9 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #13: _PyFunction_FastCallDict + 0x1d5 (0x55c8594a4805 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #14: _PyObject_Call_Prepend + 0x63 (0x55c8594bf943 in /root/miniconda3/envs/myconda/bin/python3.7)
frame #15: PyObject_Call + 0x6e (0x55c8594b2b9e in /root/miniconda3/envs/myconda/bin/python3.7)
frame #16: torch::autograd::PyNode::apply(std::vector<torch::autograd::Variable, std::allocator<torch::autograd::Variable> >&&) + 0x178 (0x7f51e30339a8 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #17: <unknown function> + 0x3d4ae06 (0x7f519f062e06 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #18: torch::autograd::Engine::evaluate_function(torch::autograd::NodeTask&) + 0x10b7 (0x7f519f05c417 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #19: torch::autograd::Engine::thread_main(torch::autograd::GraphTask*) + 0x1c4 (0x7f519f05e424 in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #20: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7f51e302ce6a in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #21: <unknown function> + 0xf14f (0x7f51e3b5c14f in /root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/_C.cpython-37m-x86_64-linux-gnu.so)
frame #22: <unknown function> + 0x76db (0x7f51f2b466db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #23: clone + 0x3f (0x7f51f286f88f in /lib/x86_64-linux-gnu/libc.so.6)

Hello526 commented 4 years ago

My environment: torch 1.3.1, torchvision 0.4.2, CUDA 10.0, gcc 7.5.

SohamTamba commented 4 years ago

I also get this error, and my library versions are similar. I would appreciate an answer to this.

Can someone tell me which library versions they used to get this working?

tianyuandu commented 4 years ago

This code works with PyTorch 1.0.0. If you want to use center pool with PyTorch 1.3, you can try replacing kByte with kBool in *_pool.cpp and recompiling. That is, change

auto gt_mask = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kByte));

to

auto gt_mask = torch::zeros({batch, channel, width}, at::device(at::kCUDA).dtype(at::kBool));

Hope that works for you.
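
In case it helps, here is a minimal sketch of applying that kByte to kBool replacement to every *_pool.cpp file in one go. It assumes all the pooling sources sit in a single directory; the helper name `patch_pool_sources` and the directory you pass in are hypothetical, not part of the repository. It only edits text, so the extension still has to be rebuilt afterwards the same way it was originally installed.

```python
import re
from pathlib import Path


def patch_pool_sources(src_dir):
    """Replace kByte with kBool in every *_pool.cpp directly under src_dir.

    Hypothetical helper, not part of the repository. Returns the names of
    the files that were actually modified.
    """
    changed = []
    for path in sorted(Path(src_dir).glob("*_pool.cpp")):
        text = path.read_text()
        # Whole-word match so unrelated identifiers are left alone.
        patched = re.sub(r"\bkByte\b", "kBool", text)
        if patched != text:
            # Keep a backup of the original next to the patched file.
            path.with_name(path.name + ".bak").write_text(text)
            path.write_text(patched)
            changed.append(path.name)
    return changed
```

Call `patch_pool_sources(...)` with the directory that holds your pool sources, check the returned list, then recompile. Running it a second time changes nothing, since no kByte occurrences remain.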