ros-perception / openslam_gmapping

Added multi-threaded scan matching using OpenMP #14

Open eybee opened 7 years ago

eybee commented 7 years ago

I profiled gmapping and found that scan matching in the particle filter is very time-consuming. With larger particle counts it easily drives even a modern CPU core to 100% load while using only one core. With the few changes I made, scan matching is now multi-threaded using the OpenMP library, which noticeably improves gmapping's overall performance. I tested the package on three different platforms:

Cartographer Turtlebot test bag, laser scan:
80 particles:

Ubuntu VM Laptop Intel i5:
single-threaded: Scan matching took: 1.59643
multi-threaded:  Scan matching took: 0.721148

Speedup = 2.21

Intel NUC Turtlebot:
single-threaded: Scan matching took: 4.32706
multi-threaded:  Scan matching took: 2.3869

Speedup = 1.81

20 particles:

PicoZed ARM v7:
single-threaded: Scan matching took: 5.19983
multi-threaded:  Scan matching took: 3.09518

Speedup = 1.68
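
The approach described above, matching each particle's scan in parallel, can be sketched roughly as follows. This is a minimal illustration and not the code from this pull request: `Particle`, `matchScan`, and `scanMatchAll` are hypothetical names, and the scoring formula is a made-up placeholder for the real scan matcher. The `speedup` helper only reproduces the ratio used in the figures above (single-threaded time divided by multi-threaded time).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a particle; not the gmapping data structure.
struct Particle {
    double x = 0.0;      // pose component used by the toy matcher below
    double score = 0.0;  // result of the scan match
};

// Placeholder for the expensive per-particle scan match. The formula is
// invented for illustration; it only needs to be a pure function of its inputs.
double matchScan(const Particle& p, const std::vector<double>& scan) {
    double score = 0.0;
    for (std::size_t i = 0; i < scan.size(); ++i) {
        score += std::exp(-std::fabs(scan[i] - p.x));
    }
    return score;
}

// Each iteration reads the shared scan and writes only to its own particle,
// so the iterations are independent and need no locking.
void scanMatchAll(std::vector<Particle>& particles, const std::vector<double>& scan) {
    assert(!scan.empty());
    const int n = static_cast<int>(particles.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        particles[i].score = matchScan(particles[i], scan);
    }
}

// Speedup = single-threaded time / multi-threaded time.
double speedup(double singleThreaded, double multiThreaded) {
    assert(multiThreaded > 0.0);
    return singleThreaded / multiThreaded;
}
```

With GCC or Clang the pragma takes effect when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and the loop runs serially with the same result. The pattern is only safe if the per-particle work does not modify state shared between particles.
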
eybee commented 7 years ago

Changed the things you mentioned.

eybee commented 7 years ago

If I use an openmp function, I do have to link against it of course.

eybee commented 7 years ago

I realized that I also need to keep the set(CXX_FLAGS) because no optimization is done otherwise.
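
For readers following the two build points above (linking against OpenMP and keeping optimization enabled), a CMake fragment along these lines is one common way to do both. It is a sketch under assumptions, not the change made in this pull request: the target and source file names are hypothetical, and the `OpenMP::OpenMP_CXX` imported target requires CMake 3.9 or newer.

```cmake
# Locate the compiler's OpenMP support.
find_package(OpenMP REQUIRED)

# Hypothetical target and source name; the real targets in this repository differ.
add_library(scanmatcher_example scanmatcher_example.cpp)

# Adds the OpenMP compile flags and links the OpenMP runtime.
target_link_libraries(scanmatcher_example PUBLIC OpenMP::OpenMP_CXX)

# Fall back to an optimized build when no build type was chosen.
if(NOT CMAKE_BUILD_TYPE)
  set(CMAKE_BUILD_TYPE Release)
endif()
```

On older CMake versions the variables `OpenMP_CXX_FLAGS` set by `find_package(OpenMP)` can be appended to the compile and link flags instead.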

alittleharry commented 7 years ago

Hi, I found that with OpenMP enabled the process core-dumps more often, with errors such as "corrupted double-linked list". Have you ever run into that?

eybee commented 7 years ago

No, I never had that issue. Could you run valgrind on it and post the result?

alittleharry commented 7 years ago

*RESAMPLE***
Deleting Nodes: 7 11 18 26 29 Done
Deleting old particles...Done
Copying Particles and Registering scans... Done
update frame 1808
update ld=1.04128 ad=0.0346855
Laser Pose= 121.1 37.684 -2.40904
m_count 185
Average Scan Matching Score=333.415
Average Scan Matching Time=0.143367s
neff= 28.2199
Registering Scans:[slam_gmapping-1] process has died [pid 8614, exit code -9, cmd /mnt/hgfs/catkin_tool/devel/lib/gmapping/slam_gmapping scan:=velodyne_2d_laserscan __name:=slam_gmapping __log:=/home/alittleharry/.ros/log/0817e1f4-bae4-11e6-8ead-000c297eec31/slam_gmapping-1.log].
log file: /home/harry/.ros/log/0817e1f4-bae4-11e6-8ead-000c297eec31/slam_gmapping-1*.log
==8595== Thread 2:
==8595== Invalid read of size 4
==8595==    at 0x5506BE: PyObject_Free (in /usr/bin/python2.7)
==8595==    by 0x4F57A3: PyFile_WriteObject (in /usr/bin/python2.7)
==8595==    by 0x436A3D: ??? (in /usr/bin/python2.7)
==8595==    by 0x49EC75: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x49AB44: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x49AB44: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x49AB44: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A1C99: ??? (in /usr/bin/python2.7)
==8595==  Address 0x95ba020 is 528 bytes inside a block of size 1,449 free'd
==8595==    at 0x4C2CB8A: realloc (vg_replace_malloc.c:785)
==8595==    by 0x52D088: _PyString_Resize (in /usr/bin/python2.7)
==8595==    by 0x510770: PyUnicodeUCS4_EncodeUTF8 (in /usr/bin/python2.7)
==8595==    by 0x4BD3FE: ??? (in /usr/bin/python2.7)
==8595==    by 0x4E2CD1: PyCodec_Encode (in /usr/bin/python2.7)
==8595==    by 0x5B6BB0: PyUnicodeUCS4_AsEncodedString (in /usr/bin/python2.7)
==8595==    by 0x4F5839: PyFile_WriteObject (in /usr/bin/python2.7)
==8595==    by 0x436A3D: ??? (in /usr/bin/python2.7)
==8595==    by 0x49EC75: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x49AB44: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==  Block was alloc'd at
==8595==    at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
==8595==    by 0x52A287: PyString_FromStringAndSize (in /usr/bin/python2.7)
==8595==    by 0x5107A2: PyUnicodeUCS4_EncodeUTF8 (in /usr/bin/python2.7)
==8595==    by 0x4BD3FE: ??? (in /usr/bin/python2.7)
==8595==    by 0x4E2CD1: PyCodec_Encode (in /usr/bin/python2.7)
==8595==    by 0x5B6BB0: PyUnicodeUCS4_AsEncodedString (in /usr/bin/python2.7)
==8595==    by 0x4F5839: PyFile_WriteObject (in /usr/bin/python2.7)
==8595==    by 0x436A3D: ??? (in /usr/bin/python2.7)
==8595==    by 0x49EC75: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x49AB44: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==

all processes on machine have died, roslaunch will exit
shutting down processing monitor...
... shutting down processing monitor complete
done
==8595== Thread 1:
==8595== Invalid read of size 4
==8595==    at 0x5506BE: PyObject_Free (in /usr/bin/python2.7)
==8595==    by 0x507EAF: PyDict_SetItem (in /usr/bin/python2.7)
==8595==    by 0x4F7F70: _PyModule_Clear (in /usr/bin/python2.7)
==8595==    by 0x4C7CAD: PyImport_Cleanup (in /usr/bin/python2.7)
==8595==    by 0x437D4B: Py_Finalize (in /usr/bin/python2.7)
==8595==    by 0x44F992: Py_Main (in /usr/bin/python2.7)
==8595==    by 0x5076F44: (below main) (libc-start.c:287)
==8595==  Address 0x682f020 is 30,592 bytes inside a block of size 32,816 free'd
==8595==    at 0x4C2BD57: free (vg_replace_malloc.c:530)
==8595==    by 0x5111F2C: closedir (closedir.c:50)
==8595==    by 0x4CDAE1: ??? (in /usr/bin/python2.7)
==8595==    by 0x49968C: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2D2D: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2CA3: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2CA3: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==  Block was alloc'd at
==8595==    at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
==8595==    by 0x5111DF0: __alloc_dir (opendir.c:207)
==8595==    by 0x4CD915: ??? (in /usr/bin/python2.7)
==8595==    by 0x49968C: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2D2D: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2CA3: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A2CA3: ??? (in /usr/bin/python2.7)
==8595==    by 0x49990E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==
==8595== Invalid read of size 4
==8595==    at 0x5506BE: PyObject_Free (in /usr/bin/python2.7)
==8595==    by 0x4F792B: ??? (in /usr/bin/python2.7)
==8595==    by 0x507EAF: PyDict_SetItem (in /usr/bin/python2.7)
==8595==    by 0x4F7F70: _PyModule_Clear (in /usr/bin/python2.7)
==8595==    by 0x4C7CAD: PyImport_Cleanup (in /usr/bin/python2.7)
==8595==    by 0x437D4B: Py_Finalize (in /usr/bin/python2.7)
==8595==    by 0x44F992: Py_Main (in /usr/bin/python2.7)
==8595==    by 0x5076F44: (below main) (libc-start.c:287)
==8595==  Address 0x5eec020 is 0 bytes inside a block of size 722 free'd
==8595==    at 0x4C2BD57: free (vg_replace_malloc.c:530)
==8595==    by 0x4F7B5E: ??? (in /usr/bin/python2.7)
==8595==    by 0x4F78C3: ??? (in /usr/bin/python2.7)
==8595==    by 0x507EAF: PyDict_SetItem (in /usr/bin/python2.7)
==8595==    by 0x4F7F70: _PyModule_Clear (in /usr/bin/python2.7)
==8595==    by 0x4C7CAD: PyImport_Cleanup (in /usr/bin/python2.7)
==8595==    by 0x437D4B: Py_Finalize (in /usr/bin/python2.7)
==8595==    by 0x44F992: Py_Main (in /usr/bin/python2.7)
==8595==    by 0x5076F44: (below main) (libc-start.c:287)
==8595==  Block was alloc'd at
==8595==    at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
==8595==    by 0x52A287: PyString_FromStringAndSize (in /usr/bin/python2.7)
==8595==    by 0x523BB8: ??? (in /usr/bin/python2.7)
==8595==    by 0x523EBF: ??? (in /usr/bin/python2.7)
==8595==    by 0x523DF7: ??? (in /usr/bin/python2.7)
==8595==    by 0x523ED5: ??? (in /usr/bin/python2.7)
==8595==    by 0x5AAD55: PyMarshal_ReadObjectFromString (in /usr/bin/python2.7)
==8595==    by 0x5AADFF: PyMarshal_ReadLastObjectFromFile (in /usr/bin/python2.7)
==8595==    by 0x5AAE3D: ??? (in /usr/bin/python2.7)
==8595==    by 0x5B1EC4: ??? (in /usr/bin/python2.7)
==8595==    by 0x540947: ??? (in /usr/bin/python2.7)
==8595==    by 0x540D07: ??? (in /usr/bin/python2.7)
==8595==
==8595== Invalid read of size 4
==8595==    at 0x57398A: PyObject_GC_Del (in /usr/bin/python2.7)
==8595==    by 0x4F6DB6: ??? (in /usr/bin/python2.7)
==8595==    by 0x4B8C97: ??? (in /usr/bin/python2.7)
==8595==    by 0x507EAF: PyDict_SetItem (in /usr/bin/python2.7)
==8595==    by 0x4F7F70: _PyModule_Clear (in /usr/bin/python2.7)
==8595==    by 0x4C7D60: PyImport_Cleanup (in /usr/bin/python2.7)
==8595==    by 0x437D4B: Py_Finalize (in /usr/bin/python2.7)
==8595==    by 0x44F992: Py_Main (in /usr/bin/python2.7)
==8595==    by 0x5076F44: (below main) (libc-start.c:287)
==8595==  Address 0x6c11020 is 16 bytes after a block of size 16 free'd
==8595==    at 0x4C2BD57: free (vg_replace_malloc.c:530)
==8595==    by 0x55AD14: ??? (in /usr/bin/python2.7)
==8595==    by 0x6C6B0AA: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x6C6AEE8: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x6C6B063: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x55A656: ??? (in /usr/bin/python2.7)
==8595==    by 0x4F6D86: ??? (in /usr/bin/python2.7)
==8595==    by 0x49975A: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x499EF1: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==  Block was alloc'd at
==8595==    at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
==8595==    by 0x52FCD6: PyList_New (in /usr/bin/python2.7)
==8595==    by 0x6C67F86: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x6C65E57: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x70CBF69: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==8595==    by 0x70CC64D: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==8595==    by 0x70CA9E0: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==8595==    by 0x70CB16C: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==8595==    by 0x70CE5DE: XML_ParseBuffer (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==8595==    by 0x6C6461C: ??? (in /usr/lib/python2.7/lib-dynload/_elementtree.x86_64-linux-gnu.so)
==8595==    by 0x49968C: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==
==8595== Invalid read of size 4
==8595==    at 0x55EAA2: PyGrammar_RemoveAccelerators (in /usr/bin/python2.7)
==8595==    by 0x437DB6: Py_Finalize (in /usr/bin/python2.7)
==8595==    by 0x44F992: Py_Main (in /usr/bin/python2.7)
==8595==    by 0x5076F44: (below main) (libc-start.c:287)
==8595==  Address 0x5fd5020 is 32 bytes inside a block of size 112 free'd
==8595==    at 0x4C2BD57: free (vg_replace_malloc.c:530)
==8595==    by 0x55AD14: ??? (in /usr/bin/python2.7)
==8595==    by 0x49AA0E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==  Block was alloc'd at
==8595==    at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
==8595==    by 0x52FCD6: PyList_New (in /usr/bin/python2.7)
==8595==    by 0x530244: ??? (in /usr/bin/python2.7)
==8595==    by 0x49A3B4: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==    by 0x4A090B: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==8595==    by 0x499A51: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==8595==
==8595==
==8595== HEAP SUMMARY:
==8595==     in use at exit: 2,658,373 bytes in 5,034 blocks
==8595==   total heap usage: 162,240 allocs, 157,206 frees, 275,080,482 bytes allocated
==8595==
==8595== LEAK SUMMARY:
==8595==    definitely lost: 0 bytes in 0 blocks
==8595==    indirectly lost: 0 bytes in 0 blocks
==8595==      possibly lost: 30,720 bytes in 54 blocks
==8595==    still reachable: 2,627,653 bytes in 4,980 blocks
==8595==         suppressed: 0 bytes in 0 blocks
==8595== Rerun with --leak-check=full to see details of leaked memory
==8595==
==8595== For counts of detected and suppressed errors, rerun with: -v
==8595== Use --track-origins=yes to see where uninitialised values come from
==8595== ERROR SUMMARY: 9220 errors from 120 contexts (suppressed: 0 from 0)

KleinYuan commented 6 years ago

Any follow ups? @vrabaud @alittleharry

JohannesBetz commented 6 years ago

Hey everyone, is there any news about the multi-threading integration? I am wondering whether eybee's branch is finished. I cloned the repo and ran some tests; the code works so far, but I don't see much improvement. So what is the status of the multi-threading project?