Closed egeldres closed 7 years ago
Hi egeldres.
That error means a crash in a CUDA kernel, the one called before line 400 in extract.cu: extractSliceKernel. It seems that my rewrite of the extraction procedure is causing trouble for many people. *sigh*
First of all:
Unfortunately, extractSliceKernel uses the templated extraction class in templated_extract.cuh. It won't be easy to debug.
I recall I had some success with cuda-memcheck. Try adding
launch-prefix="cuda-memcheck "
to the node tag in the launch file. Maybe you will get more information. KinFu will run really slowly; that is expected.
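As an illustration of the launch-prefix suggestion above, a node tag with the prefix added might look like the fragment below. This is only a sketch: the pkg, type and name values are assumptions inferred from the build path and log output quoted in this thread, not copied from the actual kinfu.launch, so keep whatever values your own launch file already has and add only the launch-prefix attribute.

```xml
<launch>
  <!-- pkg, type and name are assumed values; keep the ones from your own kinfu.launch -->
  <node pkg="kinfu" type="kinfu" name="kinect_kinfu1" output="screen"
        launch-prefix="cuda-memcheck ">
    <!-- existing <param> and <remap> entries stay unchanged -->
  </node>
</launch>
```

roslaunch prepends the launch-prefix string to the node's command line, so the node process is started under cuda-memcheck and any report it prints appears alongside the node's own output.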
Otherwise, my usual approach is to comment out things until it stops crashing, and then try to guess why.
Hi RMonica,
thanks for your response. My CUDA toolkit installation was failing, so I fixed that and checked that the CUDA examples now work, but ros_kinfu keeps crashing at the same stage. I'm using an NVIDIA 960M, which is compute capability 5.0, by the way.
I've now run the code with the cuda-memcheck launch-prefix. It was much slower, but it displayed some additional info, which I'm posting here in case somebody can read between the lines.
Checking log directory for disk usage. This may take awhile. Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.
started roslaunch server http://edu-G501JW:42270/
SUMMARY
PARAMETERS
Many thanks in advance in case you can help me a little more.
Hi egeldres.
I don't know.
Try updating to CUDA 8.0 (from the NVIDIA website) and see if it fixes anything.
Also, you may try this patch: shift_tsdf_ptrstep.txt (apply it with git apply shift_tsdf_ptrstep.txt).
Those risky pointer operations in shift_tsdf_pointer always worried me.
Hi RMonica, many thanks for your feedback. I tried CUDA 8.0 on Ubuntu 16.04, but I think VTK6 is now breaking my build; I was using 15.04 before. Are you using Ubuntu 14.04 or another version?
Best regards
Hi egeldres.
I'm using Ubuntu 16.04, but with isolated compilation (catkin build).
It seems that remove_vtk_definitions_hack.cmake is not working. See if this patch helps. cmake_current_source_dir.patch.txt
Hi RMonica,
thanks for your answer. I was using catkin_make before; now, with catkin build, there are no VTK6 issues (just a related warning), but CUDA shows the following failure during the build of kinfu:
/usr/bin/ld: /home/edu16/catkin_ws/devel/.private/kinfu/lib/kinfu/kinfu: hidden symbol `cudaMallocPitch' in /usr/local/cuda-8.0/lib64/libcudart_static.a(libcudart_static.a.o) is referenced by DSO
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
make[2]: [/home/edu16/catkin_ws/devel/.private/kinfu/lib/kinfu/kinfu] Error 1
make[1]: [CMakeFiles/kinfu.dir/all] Error 2
make: *** [all] Error 2
Only that package fails, with and without the suggested patch. There are also 2 small warnings saying that the IO features related to pcap and png will be disabled; the log shows they come from VTK6, but they are only warnings. The main problem is cudaMallocPitch.
Hi egeldres.
Sometimes it happened to me, too. I was never able to debug it, because as soon as I changed anything in the CMakeLists it just disappeared. If you can reproduce the issue consistently, try to find a fix.
Hi RMonica, thanks for the last update (https://github.com/RMonica/ros_kinfu/commit/f0393b838d4269c2875cc2887250f6f90481fed5). It fixed everything for me; it now compiles fully without issues.
I'm having problems getting the full point cloud: my requests publish just 1 point per message. But how to use the project correctly is another matter. You can close this issue now, thanks a lot again.
Hi egeldres.
Only one point? That's not normal...
You are welcome. I'm closing this issue, then.
Hi, thank you for sharing this. My build has problems extracting the point cloud. I start the server using:
roslaunch kinfu kinfu.launch
It launches fine and starts to build the TSDF; I can see the mesh image following the pose in rviz without problems.
My requests only work with ping, that is, request_type: 0. I used:
rostopic pub /kinfu1_request_topic kinfu_msgs/KinfuTsdfRequest "tsdf_header: {request_type: 0, request_id: 1, request_source_name: '/response'}"
Using the full message, or sending it from a node, gives the same result.
[ INFO] [1483841989.524819173]: Avg frame time = 40.42 ms (24.74 fps)
Received response with 0 cloud points 0 mesh cloud points 0 triangles 0 pixels 0 uint64 values 0 float32 values
[ INFO] [1483841990.936224604]: Avg frame time = 41.94 ms (23.84 fps)
[ INFO] [1483841992.327625026]: Avg frame time = 41.36 ms (24.18 fps)
I think that is fine.
But the node crashes when I send a request such as:
rostopic pub /kinfu1_request_topic kinfu_msgs/KinfuTsdfRequest "tsdf_header: {request_type: 3, request_id: 1, request_source_name: '/response'}"
I've also tried using the full default message (request_reset: false, etc.), and I built a node to rule out mistakes made in the terminal.
[ INFO] [1483840839.782851318]: Avg frame time = 40.42 ms (24.74 fps)
[ INFO] [1483840841.142515084]: Avg frame time = 40.45 ms (24.72 fps)
[ INFO] [1483840841.526482931]: kinfu: Extract Cloud Worker started.
[ INFO] [1483840841.526525828]: kinfu: Locking kinfu...
[ INFO] [1483840841.607188089]: kinfu: Locked. Extracting current volume...
The old cube's metric origin was (0.000000, 0.000000, 0.000000).
The new cube's metric origin is now (0.017582, 0.000866, 0.002557).
Error: an illegal instruction was encountered /home/edu/catkin_ws/src/kinfu/pcl_kinfu_large_scale/kinfu_large_scale/src/cuda/extract.cu:400
[kinect_kinfu1-25] process has finished cleanly
log file: /home/edu/.ros/log/a6663ff4-d53d-11e6-9612-28c2dd10d8b1/kinect_kinfu1-25*.log
I'm not sure what exactly fails. The error appears to come from the CUDA code, but I don't know whether I need to send some kind of command or set a parameter before extracting the point cloud, or before any request other than ping.
I hope someone can give me some hints on how to resolve my problem, many thanks in advance.