RMonica / ros_kinfu


extracting anything but ping fails #4

Closed egeldres closed 7 years ago

egeldres commented 7 years ago

Hi, thank you for sharing this. My build has problems extracting the point cloud. I start the server with: roslaunch kinfu kinfu.launch and it launches fine and starts building the TSDF. I can see the mesh image following the pose in rviz without problems.

My requests work only with ping, that is, request_type: 0. I've used: rostopic pub /kinfu1_request_topic kinfu_msgs/KinfuTsdfRequest "tsdf_header: {request_type: 0, request_id: 1, request_source_name: '/response'}" Sending the full message, or sending the request from a node, gives the same result.

[ INFO] [1483841989.524819173]: Avg frame time = 40.42 ms (24.74 fps)

Received response with 0 cloud points 0 mesh cloud points 0 triangles 0 pixels 0 uint64 values 0 float32 values

[ INFO] [1483841990.936224604]: Avg frame time = 41.94 ms (23.84 fps)
[ INFO] [1483841992.327625026]: Avg frame time = 41.36 ms (24.18 fps)

I think that is fine.

But when I send a request such as: rostopic pub /kinfu1_request_topic kinfu_msgs/KinfuTsdfRequest "tsdf_header: {request_type: 3, request_id: 1, request_source_name: '/response'}" the node crashes with the log below. I've also tried using the full default message (request_reset: false, etc.), and I built a node to rule out mistakes made in the terminal.

[ INFO] [1483840839.782851318]: Avg frame time = 40.42 ms (24.74 fps)
[ INFO] [1483840841.142515084]: Avg frame time = 40.45 ms (24.72 fps)
[ INFO] [1483840841.526482931]: kinfu: Extract Cloud Worker started.
[ INFO] [1483840841.526525828]: kinfu: Locking kinfu...
[ INFO] [1483840841.607188089]: kinfu: Locked. Extracting current volume...The old cube's metric origin was (0.000000, 0.000000, 0.000000).
The new cube's metric origin is now (0.017582, 0.000866, 0.002557).
Error: an illegal instruction was encountered /home/edu/catkin_ws/src/kinfu/pcl_kinfu_large_scale/kinfu_large_scale/src/cuda/extract.cu:400
[kinect_kinfu1-25] process has finished cleanly
log file: /home/edu/.ros/log/a6663ff4-d53d-11e6-9612-28c2dd10d8b1/kinect_kinfu1-25*.log

I'm not sure what exactly fails. The error comes from CUDA code, but I don't know whether I need to send some kind of command or set a parameter before extracting the point cloud, or before any request other than ping.

I hope someone can give me some hints on how to resolve my problem, many thanks in advance.

RMonica commented 7 years ago

Hi egeldres.

That error means a crash in a CUDA kernel, the one called before line 400 in extract.cu: extractSliceKernel. It seems that my rewrite of the extraction procedure is causing trouble for many people. *sigh*

First of all:

Unfortunately, extractSliceKernel uses the templated extraction class in templated_extract.cuh. It won't be easy to debug.

I recall I had some success with cuda-memcheck. Try adding

launch-prefix="cuda-memcheck "

to the node tag in the launch file. Maybe you will get more information. KinFu will run really slowly; that is expected.
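To show where the attribute goes, here is a minimal sketch that adds launch-prefix to a node tag with Python's standard xml.etree module. The launch file content is a stand-in, not the real kinfu.launch: the pkg, type, and name values are assumptions inferred from the log and build paths quoted in this thread, and add_launch_prefix is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET

# Stand-in launch file. ASSUMPTION: pkg/type/name are guessed from the
# "kinect_kinfu1" node name in the log; the real kinfu.launch has more content.
LAUNCH = """<launch>
  <node pkg="kinfu" type="kinfu" name="kinect_kinfu1" output="screen"/>
</launch>"""


def add_launch_prefix(launch_xml, node_name, prefix):
    """Return launch_xml with launch-prefix set on the node called node_name.

    Hypothetical helper for illustration; nodes with other names are untouched.
    """
    root = ET.fromstring(launch_xml)
    for node in root.iter("node"):
        if node.get("name") == node_name:
            # roslaunch prepends this string to the node's command line.
            node.set("launch-prefix", prefix)
    return ET.tostring(root, encoding="unicode")


# The trailing space in the prefix matches the suggestion above.
patched = add_launch_prefix(LAUNCH, "kinect_kinfu1", "cuda-memcheck ")
print(patched)
```

Editing the node tag by hand achieves the same thing. The script only makes explicit that the attribute belongs on the node element itself, next to pkg, type, and name.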

Otherwise, my usual approach is to comment out things until it stops crashing, and then try to guess why.

egeldres commented 7 years ago

Hi RMonica,

thanks for your response. My CUDA toolkit installation was broken, so I fixed it and the CUDA examples now work, but ros_kinfu keeps crashing at the same stage. By the way, I'm using an NVIDIA 960M, which has compute capability 5.0.

I've now run the code with the cuda-memcheck launch-prefix. It was much slower but displayed some additional info, which I'll put here in case somebody can read something out of it.


Checking log directory for disk usage. This may take awhile. Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://edu-G501JW:42270/

SUMMARY

PARAMETERS

Many thanks in advance if you can help me a little more.

RMonica commented 7 years ago

Hi egeldres.

I don't know.

Try updating to CUDA 8.0 (from the NVIDIA website) and see if it fixes anything.

Also, you may try this patch: shift_tsdf_ptrstep.txt (git apply shift_tsdf_ptrstep.txt). Those risky pointer operations in shift_tsdf_pointer always worried me.

egeldres commented 7 years ago

Hi RMonica, many thanks for your feedback. I tried CUDA 8.0 on Ubuntu 16.04, but I think VTK6 is now breaking my build; I was using 15.04 before. Are you using Ubuntu 14.04 or another version?

Best regards

RMonica commented 7 years ago

Hi egeldres.

I'm using Ubuntu 16.04, but with isolated compilation (catkin build).

It seems that remove_vtk_definitions_hack.cmake is not working. See if this patch helps: cmake_current_source_dir.patch.txt

egeldres commented 7 years ago

Hi RMonica,

thanks for your answer. I was using catkin_make before. Now, with catkin build, there are no VTK6 issues (just a related warning), but CUDA shows the following failure while building kinfu.

/usr/bin/ld: /home/edu16/catkin_ws/devel/.private/kinfu/lib/kinfu/kinfu: hidden symbol `cudaMallocPitch' in /usr/local/cuda-8.0/lib64/libcudart_static.a(libcudart_static.a.o) is referenced by DSO
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
make[2]: *** [/home/edu16/catkin_ws/devel/.private/kinfu/lib/kinfu/kinfu] Error 1
make[1]: *** [CMakeFiles/kinfu.dir/all] Error 2
make: *** [all] Error 2

Only that package fails, with and without the suggested patch. There are also two small warnings saying that I/O features related to pcap and png will be disabled. The log shows they come from VTK6, but they are only warnings; the main problem is cudaMallocPitch.

RMonica commented 7 years ago

Hi egeldres.

That sometimes happened to me, too. I was never able to debug it, because as soon as I changed anything in the CMakeLists it just disappeared. If you can reproduce the issue consistently, try to find a fix.

egeldres commented 7 years ago

Hi RMonica, thanks for the last update: https://github.com/RMonica/ros_kinfu/commit/f0393b838d4269c2875cc2887250f6f90481fed5 It fixed everything for me; the project now compiles fully without issues.

I'm having problems getting the full point cloud: my requests publish just 1 point per message. But how to use the project correctly is another matter. You can close this issue now. Thanks a lot again.

RMonica commented 7 years ago

Hi egeldres.

Only one point? That's not normal...

You are welcome. I'm closing this issue, then.