Both PyTorch and Caffe2 contain a copy of the ATen library. If they are not linked carefully, loading both into the same process can cause problems.
For example, here is a stack trace from one such failure:
Thread 1 "python" hit Catchpoint 1 (exception thrown), 0x00007fffeca15c1d in __cxa_throw () from /lib64/libstdc++.so.6
(gdb) bt
0 0x00007fffeca15c1d in __cxa_throw () from /lib64/libstdc++.so.6
1 0x00007ffeffe81a31 in at::runtime_error(char const*, ...) () from /home/lufang/gitrepos/onnx-pytorch/pytorch/torch/lib/libATen.so.1
2 0x00007ffeffe815e2 in at::UndefinedTensor::storage() () from /home/lufang/gitrepos/onnx-pytorch/pytorch/torch/lib/libATen.so.1
3 0x00007fff851894e3 in std::_Tuple_impl<0ul, at::Tensor, at::Tensor>::~_Tuple_impl() () from /home/lufang/programs/caffe2/lib/libcaffe2_gpu.so
4 0x00007fff1081d838 in std::tuple<at::Tensor, at::Tensor>::~tuple (this=0x7fffffffc720, __in_chrg=) at /usr/include/c++/4.8.2/tuple:523
5 torch::autograd::VariableType::max_pool2d (this=0x1ee4830, self=..., kernel_size=..., stride=..., padding=..., dilation=..., ceil_mode=false) at torch/csrc/autograd/generated/VariableType.cpp:6602
6 0x00007fff108b895f in at::max_pool2d (ceil_mode=false, dilation=..., padding=..., stride=..., kernel_size=..., self=...) at /home/lufang/gitrepos/onnx-pytorch/pytorch/torch/lib/tmp_install/include/ATen/Functions.h:2038
7 torch::autograd::dispatch_max_pool2d (ceil_mode=false, dilation=..., padding=..., stride=..., kernel_size=..., self=...) at torch/csrc/autograd/generated/python_nn_functions_dispatch.h:140
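Apparently, some ATen-related function implementations from Caffe2 are exposed, and PyTorch accidentally loaded and invoked them. Note that frame 3 comes from libcaffe2_gpu.so, while the surrounding ATen frames come from PyTorch's own libATen.so.1.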
One temporary workaround (suggested by @dzhulgakov) is to remove the "ctypes.RTLD_GLOBAL" flag from extension_loader.py in Caffe2. I am not sure whether this will break other parts of Caffe2.
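To illustrate what that flag controls, here is a minimal sketch of a loader guard. It is not the actual code from extension_loader.py: the helper name `dlopen_guard` is hypothetical, and the sketch assumes a Unix-like platform where `sys.getdlopenflags()` and `sys.setdlopenflags()` are available. With RTLD_GLOBAL set, the symbols of a library loaded inside the guard become available for symbol resolution in libraries loaded afterwards, which is the kind of exposure described above. Without it, they stay local to the library that was loaded.

```python
import contextlib
import ctypes
import sys


@contextlib.contextmanager
def dlopen_guard(extra_flags=0):
    """Hypothetical helper: temporarily OR extra_flags into the flags
    Python passes to dlopen() when importing extension modules."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | extra_flags)
    try:
        yield
    finally:
        # Always restore the previous flags, even if the import fails.
        sys.setdlopenflags(old_flags)


flags_before = sys.getdlopenflags()

# Current behaviour described in this issue: extensions imported inside
# this block would be loaded with RTLD_GLOBAL.
with dlopen_guard(ctypes.RTLD_GLOBAL):
    flags_with_global = sys.getdlopenflags()
    # an extension module import would go here

# Proposed workaround: do not add RTLD_GLOBAL, keep the default flags.
with dlopen_guard():
    flags_without_global = sys.getdlopenflags()
    # an extension module import would go here

flags_after = sys.getdlopenflags()
```

In this sketch the workaround amounts to calling the guard without the extra flag, so the extension is loaded with whatever flags the interpreter already uses. Whether other Caffe2 libraries rely on globally visible symbols is exactly the open question raised above.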
I am creating this issue to track the problem until it is fixed.
cc: @soumith @ezyang @zdevito @dzhulgakov @bddppq