Open ShepelIlya opened 2 years ago
@TheCodez can you check this out? There may be extra debug code somewhere. I create this PR manually deleting all changes that i need for my ROS node.
@ShepelIlya Thank you. I will take a look as soon as possible.
Nice -- I was seeing an occasional crash in CUDA but haven't been able to get to it. I ported your changes into my fork and things look stable with a small caveat -- I added a function to free the GridCellSoA in dogm.h/cu since the ROS functions/layers know nothing about CUDA and die a horrible death if cleanup isn't done. Looks like:
void DOGM::freeGridCells( GridCellsSoA grid_cells ) const
{
grid_cells.free();
}
Yeah, i found it too later, but i forgot to update this PR.
@idlebear have you seen a noticable runtime performance difference with SoA grid cells?
@ShepelIlya no worries, I will include a memory leak fix once I have some more time to integrate your changes.
So it's... slower? Here are the collected stats for a couple runs. I'm using a 100x100m grid with 0.25m resolution (cartesian) and 0.1m resolution (polar). Maybe extra allocs that are making the difference? Don't know without further digging. The third is without opengl...
----- original -- opengl -----------------------------------
After 1001 iterations: 8.48563mS per iteration
After 1001 iterations: 8.06899mS per iteration
------------------------------------------------------------
----- SoA -- opengl --- 0.25 grid --- 0.1 polar ------------------------
After 1001 iterations: 9.32966mS per iteration
After 1001 iterations: 9.7832mS per iteration
------------------------------------------------------------
----- SoA -- polar->cartesian --- 0.25 grid --- 0.1 polar ------------------------
After 1001 iterations: 4.74349mS per iteration
------------------------------------------------------------
This is the original view (video is in a non-standard res so it's probably squashed in your view):
Here's the SoA version (looks the same):
And for completeness, polar->cartesian:
I'm also seeing the artifacts in all versions...
I can post the debug and release binaries later today when I get a chance -- got a preferred delivery mechanism?
@idlebear I'm seeing no improvement either, which is unfortunate as I was hoping that this would get the performance closer to the paper.
As for the the trails and noise, I think it will get a lot better once you increase the number of particles.
Just upload the binaries here or in the discussion thread if possible.
Yeah, looking at the profiler output, of those ~10mS, the biggest wins seem to be in processing the texture and particle assignment. It's a bit different with the car in motion, but those numbers are still fairly representative,.
This changes comes from testing dogm on live robot at field conditions. I tried to roughly group all my changes into several commits. 1) SoA syntax, as you mentioned in #6 2) bugfix for particle prediction and for moving map when robot moves (Note: in my setup map orientation doesn't change with robot orientation) 3) a little memeory leak (found with compute-sanitazer from cuda tools) 4) i am using your code as part of my ROS node, so there is a little refactoring; I committed this just so you have my working version 5) test commit for #7