onlytailei closed this issue 6 years ago
I'll check out the code, but I don't think this is a bug with C++ extensions per se. Did you test your C++ code before binding it into Python, and make sure it was generally correct and does not segfault?
Yes. I tried the pure C++ example. There is no segfault. You can download this C++ example from here.
mkdir build
cd build
cmake ../
make
./iterative_closest_point
https://github.com/onlytailei/icp_extension/blob/master/icp_op.cpp#L88 looks wrong to me. tensorFromBlob does not copy data. It only references the blob you give it. You have to call .clone() to deep copy the data.
Change
at::Tensor output = torch::CPU(at::kFloat).tensorFromBlob(output_array, {batch_size,p_cloud_size, 3});
to
at::Tensor output = torch::CPU(at::kFloat).tensorFromBlob(output_array, {batch_size,p_cloud_size, 3}).clone();
or even better, to
at::Tensor output = torch::from_blob(output_array, {batch_size, p_cloud_size, 3}).clone();
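To see why the .clone() matters, here is a minimal sketch of the aliasing behaviour that does not depend on PyTorch. BlobView, view_from_blob, make_output and view_aliases_but_clone_does_not are hypothetical stand-ins invented for illustration, not the PyTorch API; the only assumption carried over from the discussion above is that a tensor built from a blob refers to the caller's memory until it is cloned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor built from a blob: it stores the
// caller's pointer and neither owns nor copies the memory.
struct BlobView {
    float* data;
    std::size_t size;

    // Deep copy into storage owned by the returned vector.
    std::vector<float> clone() const {
        return std::vector<float>(data, data + size);
    }
};

// Non-owning wrapper around a raw buffer (illustrative only).
BlobView view_from_blob(float* blob, std::size_t size) {
    assert(blob != nullptr);
    return BlobView{blob, size};
}

// Returns an owning copy, so the result stays valid after the local
// buffer `scratch` is destroyed at the end of the function. Returning
// the BlobView itself would hand back a dangling pointer.
std::vector<float> make_output() {
    float scratch[3] = {1.0f, 2.0f, 3.0f};
    return view_from_blob(scratch, 3).clone();
}

// Shows that a view aliases the buffer while a clone does not.
bool view_aliases_but_clone_does_not() {
    float buffer[3] = {1.0f, 2.0f, 3.0f};
    BlobView view = view_from_blob(buffer, 3);
    std::vector<float> copy = view.clone();
    buffer[0] = 42.0f;  // mutate the underlying blob
    return view.data[0] == 42.0f && copy[0] == 1.0f;
}
```

If output_array in icp_op.cpp is a local or temporary buffer, the un-cloned tensor would be in the same position as a BlobView returned from make_output: it would point at memory that no longer exists.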
Thank you! And any idea about this line?
pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
As soon as you uncomment this line, the segfault happens. In the cpp_example, it is fine.
From the debug info, it seems that some members of icp cannot be released successfully through the Boost smart pointers. However, I have no idea why nothing goes wrong in the pure C++ example. Maybe there is some conflict between Boost and torch.
0 0x00007fffba96fd12 in boost::detail::atomic_exchange_and_add (dv=-1, pw=0x656572546453) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:50
1 boost::detail::sp_counted_base::release (this=0x65657254644b) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:144
2 boost::detail::shared_count::~shared_count (this=0x13d4660, __in_chrg= ) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
3 boost::shared_ptr<pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~shared_ptr (this=0x13d4658, __in_chrg= ) at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
4 pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~KdTree (this=0x13d4620, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/search/kdtree.h:99
5 pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~KdTree (this=0x13d4620, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/search/kdtree.h:99
6 boost::checked_delete<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > > (x=0x13d4620) at /usr/include/boost/core/checked_delete.hpp:34
7 boost::detail::sp_counted_impl_p<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > >::dispose (this= ) at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
8 0x00007fffba96b8fa in boost::detail::sp_counted_base::release (this=0x13d4570) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
9 0x00007fffba976c5d in boost::detail::sp_counted_base::release (this= ) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:109
10 boost::detail::shared_count::~shared_count (this=0x13b3950, __in_chrg= ) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
11 boost::shared_ptr<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > >::~shared_ptr (this=0x13b3948, __in_chrg= ) at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
12 pcl::registration::CorrespondenceEstimationBase<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimationBase (this=this@entry=0x13b3900, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:109
13 0x00007fffba976d10 in pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimation (this=0x13b3900, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:419
14 pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimation (this=0x13b3900, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:419
15 boost::checked_delete<pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float> > (x=0x13b3900) at /usr/include/boost/core/checked_delete.hpp:34
16 boost::detail::sp_counted_impl_p<pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float> >::dispose (this= ) at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
17 0x00007fffba96b8fa in boost::detail::sp_counted_base::release (this=0x13d4590) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
18 0x00007fffba973ed5 in boost::detail::sp_counted_base::release (this= ) at /usr/include/boost/function/function_template.hpp:510
19 boost::detail::shared_count::~shared_count (this=0x13bfa50, __in_chrg= ) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
20 boost::shared_ptr<pcl::registration::CorrespondenceEstimationBase<pcl::PointXYZ, pcl::PointXYZ, float> >::~shared_ptr (this=0x13bfa48, __in_chrg= ) at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
21 pcl::Registration<pcl::PointXYZ, pcl::PointXYZ, float>::~Registration (this=this@entry=0x13bf8c0, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/registration.h:132
22 0x00007fffba973fc5 in pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>::~IterativeClosestPoint (this=0x13bf8c0, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/icp.h:155
23 pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>::~IterativeClosestPoint (this=0x13bf8c0, __in_chrg= ) at /usr/local/include/pcl-1.8/pcl/registration/icp.h:155
24 0x00007fffba96d062 in boost::movelib::default_delete<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> >::operator()<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> > (this= , ptr= ) at /usr/include/boost/move/default_delete.hpp:181
25 boost::movelib::unique_ptr<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>, boost::movelib::default_delete<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> > >::~unique_ptr (this= , __in_chrg= ) at /usr/include/boost/move/unique_ptr.hpp:559
26 icp_forward (p_cloud=..., q_cloud=...) at icp_op.cpp:77
Another problem is that you mentioned torch::from_blob. Which extra header file should I include to use it? With torch/torch.h, it cannot find this function.
For the from_blob: That was introduced in a later version of PyTorch, it wasn't available in 0.4.0 -- my bad.
I spent some time today trying to reproduce your bug in a Docker container, but I could not. I used this Dockerfile:
FROM ubuntu:xenial
RUN apt-get update -y \
&& apt-get install -y git cmake vim make wget gnupg build-essential software-properties-common gdb
RUN apt-get install -y libpcl-dev
RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh \
&& chmod +x miniconda.sh \
&& ./miniconda.sh -b -p ~/local/miniconda
RUN . ~/local/miniconda/bin/activate && conda install -c pytorch pytorch==0.4.0
WORKDIR /home
and then everything seems to work pretty well:
(base) root@7354d80b0db7:/home# python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())'
/root/local/miniconda/lib/python3.6/site-packages
(base) root@7354d80b0db7:/home# find /root/local/miniconda/lib/python3.6/site-packages -name cpp_extension.pty
(base) root@7354d80b0db7:/home# find /root/local/miniconda/lib/python3.6/site-packages -name cpp_extension.py
/root/local/miniconda/lib/python3.6/site-packages/torch/utils/cpp_extension.py
(base) root@7354d80b0db7:/home# less /root/local/miniconda/lib/python3.6/site-packages/torch/utils/cpp_extension.py
(base) root@7354d80b0db7:/home# ls
Dockerfile README.md __pycache__ icp.py icp_op.cpp icp_test.py setup.py
(base) root@7354d80b0db7:/home# python setup.py build develop
running build
running build_ext
building 'icp_cpp' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/local/miniconda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DEIGEN_YES_I_KNOW_SPARSE_MODULE_IS_NOT_STABLE_YET=1 -I/root/local/miniconda/lib/python3.6/site-packages/numpy/core/include -I/usr/include/pcl-1.7 -I/usr/include/ni -I/usr/include/eigen3 -I/usr/include/ni -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include/TH -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include/THC -I/root/local/miniconda/include/python3.6m -c icp_op.cpp -o build/temp.linux-x86_64-3.6/icp_op.o -DTORCH_EXTENSION_NAME=icp_cpp -std=c++11
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
creating build/lib.linux-x86_64-3.6
g++ -pthread -shared -B /root/local/miniconda/compiler_compat -L/root/local/miniconda/lib -Wl,-rpath=/root/local/miniconda/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/icp_op.o -lpcl_registration -lpcl_segmentation -lpcl_features -lpcl_surface -lpcl_tracking -lpcl_filters -lpcl_sample_consensus -lpcl_visualization -lpcl_io -lOpenNI -lpcl_search -lpcl_kdtree -lflann_cpp -lpcl_octree -lpcl_common -o build/lib.linux-x86_64-3.6/icp_cpp.cpython-36m-x86_64-linux-gnu.so -lboost_system
running develop
running egg_info
creating icp_cpp.egg-info
writing icp_cpp.egg-info/PKG-INFO
writing dependency_links to icp_cpp.egg-info/dependency_links.txt
writing top-level names to icp_cpp.egg-info/top_level.txt
writing manifest file 'icp_cpp.egg-info/SOURCES.txt'
reading manifest file 'icp_cpp.egg-info/SOURCES.txt'
writing manifest file 'icp_cpp.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.6/icp_cpp.cpython-36m-x86_64-linux-gnu.so ->
Creating /root/local/miniconda/lib/python3.6/site-packages/icp-cpp.egg-link (link to .)
Adding icp-cpp 1.0 to easy-install.pth file
Installed /home
Processing dependencies for icp-cpp==1.0
Finished processing dependencies for icp-cpp==1.0
(base) root@7354d80b0db7:/home# python icp_test.py
tensor([[[-1.2634e+00, -2.3912e-01, 3.1981e-01],
[-7.7116e-01, 3.9494e-02, -3.2341e-01],
[-2.0449e+00, 6.7875e-01, -9.5829e-01],
...,
[-1.1826e-01, -1.0028e+00, -8.6894e-02],
[ 2.6089e-01, -8.6151e-02, 3.6891e-01],
[ 1.4749e-01, 9.5050e-01, -4.9166e-01]]]) tensor([[[-1.6041, -0.4577, 0.8348],
[ 1.0041, -1.2082, -0.4258],
[ 0.1405, -1.9008, -1.2343],
...,
[-0.1316, -0.4407, -0.4610],
[ 1.6476, -1.3544, 0.5584],
[-0.1154, 0.6452, -0.3808]]])
Did you make any progress on your end?
Thank you @goldsborough ! I tried your docker. It really works!! I will check my environment and close the issue. Many thanks!
@onlytailei I am having the same issue. Were you able to resolve this problem? I get a segmentation fault at IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp; I am using stable PyTorch 1.1 and cuda-toolkit 10.0.
I am having the same issue. Were you able to resolve this problem? I get a segmentation fault at kdtree.setInputCloud(clouds);
I’m trying to build a cpp extension for point cloud iterative closest point using the icp function in pcl-1.7 http://pointclouds.org/documentation/tutorials/iterative_closest_point.php.
Converting the data from at::Tensor to pcl::PointCloud works fine. However, as soon as I declare a new icp object, there is a segmentation fault.
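For readers unfamiliar with the conversion step, the sketch below shows the general idea on plain buffers. It assumes the tensor data is a contiguous, row-major [num_points, 3] float array; Point3, points_from_flat and flat_from_points are hypothetical names used for illustration, with Point3 standing in for pcl::PointXYZ. This is not the code from icp_op.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for pcl::PointXYZ.
struct Point3 {
    float x, y, z;
};

// Builds a point list from a contiguous row-major [num_points, 3] buffer.
// Each point is copied, so the result does not alias `data`.
std::vector<Point3> points_from_flat(const float* data, std::size_t num_points) {
    assert(data != nullptr || num_points == 0);
    std::vector<Point3> cloud;
    cloud.reserve(num_points);
    for (std::size_t i = 0; i < num_points; ++i) {
        cloud.push_back(Point3{data[3 * i], data[3 * i + 1], data[3 * i + 2]});
    }
    return cloud;
}

// Inverse direction: flattens the points back into x, y, z triples.
std::vector<float> flat_from_points(const std::vector<Point3>& cloud) {
    std::vector<float> flat;
    flat.reserve(cloud.size() * 3);
    for (const Point3& p : cloud) {
        flat.push_back(p.x);
        flat.push_back(p.y);
        flat.push_back(p.z);
    }
    return flat;
}
```

Because both directions copy element by element, neither result depends on the lifetime of the buffer it was built from.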
I also tried adding more arguments to the CppExtension, as in https://github.com/strawlab/python-pcl/blob/master/setup.py, but it doesn’t help.
To reproduce the bug, you can clone the related files from https://github.com/onlytailei/icp_extension. PCL and Eigen must be installed on the system.
Then build the extension through:
Comment/Uncomment this line in icp_op.cpp.
Then rebuild the extension, and you will see the difference.