RMonica / ros_kinfu


Problem running with PrimeSense #2

Open t-lou opened 8 years ago

t-lou commented 8 years ago

Hi, I have been trying to use this package (the forked one) with a PrimeSense sensor, but I am having a problem with the responses. There is output only for TSDF, but the response is empty for all other requests. I have traced the problem to get_result_size in ros_kinfu/kinfu/pcl_kinfu_large_scale/kinfu_large_scale/src/cuda/templated_extract.cuh, where both overestimated and subtraction are always zero (even with other pre-set values). Any advice on what the problem is? Thanks!
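As I understand the extraction scheme, the result size is derived from two counters: the total output space reserved and the part of it that was reserved but never filled. A host-side sketch of such a two-counter scheme is below; all names (`ExtractionCounters`, `reserve_chunk`) are hypothetical, and the real code runs these updates on the GPU with atomicAdd:

```cpp
#include <atomic>
#include <cstddef>

// Host-side sketch (hypothetical names) of a two-counter extraction
// size computation: each worker reserves a fixed-size chunk of output
// space, possibly filling only part of it; the unused slots go into a
// second counter, and the final size is the difference. If both
// counters stay at zero, the code updating them never ran at all.
struct ExtractionCounters {
  std::atomic<std::size_t> reserved{0};      // total slots reserved
  std::atomic<std::size_t> overestimate{0};  // reserved but unfilled slots

  // A worker that produced `filled` results out of a `chunk`-sized
  // reservation records both quantities.
  void reserve_chunk(std::size_t chunk, std::size_t filled) {
    reserved.fetch_add(chunk);
    overestimate.fetch_add(chunk - filled);
  }

  std::size_t result_size() const {
    return reserved.load() - overestimate.load();
  }
};
```

In this model, "both counters always zero" means no worker ever executed `reserve_chunk`, which matches the suspicion later in the thread that the kernel is never launched.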

RMonica commented 8 years ago

Hi t-lou.

That is weird. It works for me.

About a year ago, I had to rewrite most of the extraction from TSDF volume, because it used warp-synchronous programming, which is no longer supported by CUDA. The extract.cu and extract.cuh files were the most affected. I can't exclude that I made some mistakes and there is some race condition hidden somewhere.

In the part of the message you edited out, you mention that you are using a Fermi video card. That is a quite old architecture. The oldest architecture I used is Kepler (Compute Capability 3.0).

Another possibility is that you can't download any data because there is no data in the TSDF volume. Are you sure the input topic is connected? Can you see the synthetic depth map on /kinfu_current_view? When you download the points from the TSDF volume, are they all in unknown status (intensity value = 0)?
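One way to check the last point is to count the voxels with nonzero intensity in the downloaded cloud. A minimal sketch with a hypothetical point type follows (the real node would hand you a PCL cloud with an intensity channel):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal point type standing in for a PCL point with an
// intensity channel; intensity == 0 marks an unknown voxel in the
// downloaded TSDF cloud.
struct VoxelPoint {
  float x, y, z;
  float intensity;
};

// Count how many downloaded voxels actually carry data. If this is
// zero, the TSDF volume never received any input.
std::size_t count_known(const std::vector<VoxelPoint> &cloud) {
  std::size_t known = 0;
  for (const auto &p : cloud)
    if (p.intensity != 0.0f)
      ++known;
  return known;
}
```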

lou-magazino commented 8 years ago

Hi Monica,

I'm t-lou and this is my account for work.

I found that the TSDF reports a full size of regex{120+} (many zeros; setting the buffer size to it would lead to a memory leak). The kinfu view from ImagePublisher looks good. There are 56678 TSDF points in the output, with intensity i from -1 to 1, so I guess the TSDF is fine. I just checked the order of init_globals and get_result_size; it is always init_globals -> kernel -> get_result_size. Now I am checking the two counters, which are used only in atomicAdd. Any ideas?

About THint: are only HINT_TYPE_FORCED and HINT_TYPE_NONE valid? Which one uses the tf from the request, and which uses only the tf from kinfu_large_scale? I'm a bit confused about this part.

Thank you so much for your reply! Tongxi

PS: here is a screenshot. generateTrianglesWithNormals looks like the only problem so far. screenshot from 2016-11-08 12 51 26

lou-magazino commented 8 years ago

The new finding may be that FullScan6::templatedExtract is not called only from generateTrianglesWithNormals, even if I set the counts to some value and return directly. The function works fine in IncompletePointsExtractor::extractSliceAsCloud. Besides, TrianglesExtractor::tranc_dist is never used. Are these related? Thanks

RMonica commented 8 years ago

Hi t-lou.

The TSDF volume seems fine.

I can't reproduce the problem. If you find the bug, please let me know.

To be honest, I find your last comment hard to understand. I am not surprised if there are some unused variables. The KinFu code was never very clean to begin with.

For your previous question:

lou-magazino commented 8 years ago

Hi RMonica,

I tried to change the kernel to something like "if (called by generateTrianglesWithNormals) then { set count_overestimate.. to 100 }", but the value is still 0 (for both counters). So my guess is that the kernel is not called at all. This is not yet confirmed, though, because I still only have a GPU with arch 2.1.

Is your GPU architecture 3.5+? Maybe TSDF matching is an adaptive parallel process because of the searching, in which case the minimum hardware requirement would be arch 3.5 (https://devblogs.nvidia.com/parallelforall/introduction-cuda-dynamic-parallelism/). In kinfu_large_scale alone, the tf is often lost even when I keep the velocity low. Could it be caused by incomplete matching, or is it normal?

I have made some changes in the CMakeLists which install the packages. Maybe they are interesting for you. Thank you for your help!

t-lou

RMonica commented 8 years ago

Hi t-lou.

The oldest GPU I ran the node on is a GeForce GTX 670 (Kepler, Compute Capability 3.0). It still works today. I'm quite sure KinFu does not use Dynamic Parallelism. However, I may have used some features of Compute Capability 3.0 without noticing.

It seems that your issue is with the Marching Cubes algorithm, which calls generateTrianglesWithNormals. The node attempts to run Marching Cubes in parallel with normal KinFu execution. Maybe it is not supported by your video card. Try sending a COMMAND_TYPE_SUSPEND to KinFu before the request. Also, check that you have enough video memory for two TSDF volumes at once.
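For the memory check, a rough footprint estimate can help. The sketch below assumes 4 bytes per voxel (PCL packs a 16-bit TSDF value with a 16-bit weight); the function name is made up for illustration:

```cpp
#include <cstddef>

// Rough memory footprint of a cubic TSDF volume, assuming 4 bytes per
// voxel (a 16-bit truncated signed distance packed with a 16-bit
// weight, as in PCL's TSDF representation).
std::size_t tsdf_bytes(std::size_t resolution) {
  return resolution * resolution * resolution * 4;
}
```

Under this assumption, a 512-voxel-per-side volume takes 512 MiB, so holding two of them already strains a card with under 1 GiB of memory, while a 256-voxel volume needs only 64 MiB.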

KinFu tracking isn't very good. Also, you are using an old video card, which degrades performance. It also depends on the scene you are trying to scan.

I had a look at your CMakeLists. The changes are mostly cosmetic, but you also added installation. I will look into it if I have time.

P.S.: I just pushed an update. It is just refactoring (I merged kinfu and kinfu_output). It shouldn't affect your issue.

lou-magazino commented 8 years ago

Hi RMonica,

I have tried suspending before "generateTrianglesWithNormals" and pausing for 1 or 2 seconds between operations, but it does not help. I just recalled that tracking is always lost after moving 2~5 meters (volume size 256x256x256), even if I move slowly and steadily and avoid large flat surroundings. You said the problem may be in the Marching Cubes algorithm. Is the lost tracking also related to it? I have tried volume sizes 32, 64, 128, 256 and 384, so memory should not be the problem (there is 900+ MB).

Yeah, my main modifications are the installation and the support for arch 2.1. The other stuff is not important.

Thanks a lot for your time! t-lou

RMonica commented 8 years ago

Hi t-lou.

When you move outside the TSDF volume, which is about 3m wide by default, kinfu_large_scale attempts to download slices of it to CPU RAM, in order to free GPU memory and continue 3D reconstruction. This operation is called "shifting". That is the "large scale" part of kinfu_large_scale: KinFu, by itself, would not reconstruct outside the initial TSDF volume. If tracking is consistently lost when a shifting occurs, then probably the shifting procedure is not working properly. Incidentally, shifting relies on extractSliceAsCloud as well - are you sure it is working?
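The shifting decision described above can be sketched as a simple distance test: once the camera strays far enough from the current volume center, a shift is triggered and the slice left behind is downloaded to CPU RAM. The function name and the threshold fraction below are illustrative, not the node's actual logic:

```cpp
#include <cmath>

// Illustrative shifting trigger (hypothetical names and threshold):
// shift when the camera is further from the volume center than a
// fixed fraction of the volume size.
bool needs_shift(double cam_x, double cam_y, double cam_z,
                 double center_x, double center_y, double center_z,
                 double volume_size, double fraction = 0.25) {
  const double dx = cam_x - center_x;
  const double dy = cam_y - center_y;
  const double dz = cam_z - center_z;
  return std::sqrt(dx * dx + dy * dy + dz * dz) > fraction * volume_size;
}
```

With a 3 m volume and a 0.25 fraction, moving about 0.75 m from the center would already trigger a shift, which is consistent with tracking problems appearing after a few meters of camera travel if shifting is broken.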

I think that old CUDA compilers were unreliable when compiling templated code. Is your CUDA driver/toolkit updated?

Is there any chance you can test on a more modern video card?

I don't have any other ideas.

You are welcome.

lou-magazino commented 8 years ago

Hi RMonica,

Thanks for the explanation; I always thought everything was stored on the GPU or in a file.

I'm quite sure extractSliceAsCloud works well. As far as I understand, the bounding box also relies on it, and that part is okay. Actually, very often the program processes the scene in 2~3 cubes, but they all fail in generateTrianglesWithNormals.

I'm using CUDA 8.0 from NVidia, so the driver and library should be okay. I'm trying to get a newer GPU.

I just wrote a minimal node to run the original kinfu_large_scale (at most with a lower cube resolution). Compared with your version, shifting is slower, but re-tracking after tracking is lost is possible (with your package, once tracking is lost, the program won't run anymore, even if I move the camera back). The original one might also track better (over longer distances, and it is more robust against planes). Is this because of hardware?

RMonica commented 8 years ago

Hi t-lou.

It is possible that the original kinfu_large_scale tracking works better. After all, I changed many things, over time, for many reasons. And most of the time I use external tracking anyway. Also, many say that the PCL KinFu implementation itself is outdated and better, more efficient implementations are available. Use the tool that works better for you.

I removed the tracking recovery because it looked more like a misfeature to me. KinFu will often notice lost tracking too late. The reconstruction will already be corrupted by then. From my point of view, an error message is a more reliable result.

If I remember properly, you can resume after a lost tracking by sending a COMMAND_TYPE_NONE, with the current sensor pose as hint. The node will attempt to resume from that pose.

RMonica commented 8 years ago

Hi t-lou.

I specified in the INSTALL.txt file that the minimum compute capability is 3.0. I will not add support for 2.x, because it is deprecated and it is going to be removed by NVidia anyway.

I have investigated the bad tracking, and I discovered a bug, caused by my changes to the ray casting algorithm. Thanks for the bug report. Sorry for the mistake on my part.

I updated the repository. If you are still interested, can you test the fix?

I hope I did not break anything else.

lou-magazino commented 8 years ago

Hi RMonica,

sure, I'll test it later, maybe after calibrating the camera for long range. Do you see bad distortion beyond around 3 m? If so, how do you correct it?

t-lou

RMonica commented 8 years ago

Hi t-lou.

I haven't noticed any distortion, but I rarely reconstruct at that distance. Also, I use Kinect v2, which likely has a different set of problems because of time-of-flight technology.

lou-magazino commented 8 years ago

Hi RMonica,

I got this message when I try to compile on Ubuntu 16.04 with CUDA 8.0. Any ideas? Compiling my node is okay, and compiling on Ubuntu 14.04 with CUDA 8.0 is also okay (although there is a driver problem). Thanks!

```
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified
CMake Error at KinfuLargeScaleCUDA_generated_tsdf_volume.cu.o.cmake:207 (message):
  Error generating
  /home/tlou/catkin_ws/build/ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/kinfu_large_scale/src/cuda/./KinfuLargeScaleCUDA_generated_tsdf_volume.cu.o

ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/build.make:154: recipe for target 'ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/kinfu_large_scale/src/cuda/KinfuLargeScaleCUDA_generated_tsdf_volume.cu.o' failed
make[2]: *** [ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/kinfu_large_scale/src/cuda/KinfuLargeScaleCUDA_generated_tsdf_volume.cu.o] Error 1
CMakeFiles/Makefile2:3940: recipe for target 'ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/all' failed
make[1]: *** [ros_kinfu/kinfu/pcl_kinfu_large_scale/CMakeFiles/KinfuLargeScaleCUDA.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[  2%] Built target _kinfu_msgs_generate_messages_check_deps_RequestResult
[  2%] Built target _kinfu_msgs_generate_messages_check_deps_KinfuPose
```

PS: The Kinect uses a projected pattern for depth, but yeah, the distortion is small and quite ideal. You are lucky not to be using a PrimeSense for it :D

RMonica commented 8 years ago

Hi t-lou.

It seems that you ran into this bug: https://github.com/PointCloudLibrary/pcl/issues/776 which is weird, since I already implemented the suggested hack in https://github.com/RMonica/ros_kinfu/blob/master/kinfu/pcl_kinfu_large_scale/remove_vtk_definitions_hack.cmake and it works for me.

Use catkin_make VERBOSE=1. You should be able to see the actual command that calls nvcc. Check whether there are other definitions or compiler flags that must be removed, and remove them.

Also, you may try isolated compilation ("catkin_make_isolated" or "catkin build").