Closed: kosuke55 closed this issue 1 year ago
cc @HiroIshida If you know anything about this, I would appreciate it if you could let me know.
1) As for RRT-star, the algorithm can occasionally fail to find a solution because of its random nature. One workaround would be to fix the random seed of the planner.
2) As for the second issue, I have no idea. Does this issue happen only occasionally, with the tests usually passing? Also, did it start occurring recently?
I can work on these issues, but I don't have time until 12/9.
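To illustrate the seed workaround from 1), here is a minimal sketch of a sampler whose random engine takes an explicit seed. `SeededSampler`, `Sample`, `drawSamples`, and `sameSamples` are hypothetical names for illustration only, not part of freespace_planning_algorithms, and the real RRT-star planner may expose its random source differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the sampling step of a randomized planner.
// This is NOT the freespace_planning_algorithms API.
struct Sample
{
  double x;
  double y;
};

class SeededSampler
{
public:
  // With a fixed seed, the engine produces the same sequence on every run
  // of the same binary, so a test sees the same samples each time.
  SeededSampler(std::uint32_t seed, double width, double height)
  : engine_(seed), dist_x_(0.0, width), dist_y_(0.0, height)
  {
    assert(width > 0.0 && height > 0.0);
  }

  Sample next()
  {
    // Draw x and y in separate statements so the order is well defined.
    const double x = dist_x_(engine_);
    const double y = dist_y_(engine_);
    return Sample{x, y};
  }

private:
  std::mt19937 engine_;
  std::uniform_real_distribution<double> dist_x_;
  std::uniform_real_distribution<double> dist_y_;
};

// Draw `count` samples from a 100 x 50 area (arbitrary illustrative size).
std::vector<Sample> drawSamples(std::uint32_t seed, std::size_t count)
{
  SeededSampler sampler(seed, 100.0, 50.0);
  std::vector<Sample> samples;
  samples.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    samples.push_back(sampler.next());
  }
  return samples;
}

// Exact comparison is intended here: identical seeds should give
// bit-identical samples.
bool sameSamples(const std::vector<Sample> & a, const std::vector<Sample> & b)
{
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].x != b[i].x || a[i].y != b[i].y) {
      return false;
    }
  }
  return true;
}
```

Calling `drawSamples(42u, 100)` twice should yield identical sequences, whereas seeding from `std::random_device` would not. A fixed seed makes a test repeatable, but it only hides the cases in which the planner fails; it does not remove them.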
Thanks for the reply. (and good luck with your thesis!)
- As for RRT-star, the algorithm could occasionally fail at finding a solution because of its random nature. One workaround for this could be fixing the random seed of the planner.
Fixing the random seed seems good.
- As for the second issue, I have no idea. Does this issue happen only occasionally, with the tests usually passing? Also, did it start occurring recently?
It seems to happen occasionally, not every time. It has been seen since switching to Humble.
These tests are temporarily commented out in https://github.com/autowarefoundation/autoware.universe/pull/2440. We need to fix them.
When I run the node on Humble (in a clean Docker environment), I cannot reproduce any error.
When I compile the code with AddressSanitizer (https://github.com/google/sanitizers/wiki/AddressSanitizer), it reports some memory leaks around rosbag2, so there is a chance that libclass_loader.so from the class_loader package (https://github.com/ros/class_loader) is related to the bug.
AddressSanitizer did not report any memory error other than these leaks.
h-ishida@03238e19ce66:~/autoware/src/universe/autoware.universe/planning/freespace_planning_algorithms$ ./build/freespace_planning_algorithms-test
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from AstarSearchTestSuite
[ RUN ] AstarSearchTestSuite.SingleCurvature
plan success : 582.046[msec], solution cost : 34.2057
[INFO] [1671047305.929521507] [rosbag2_storage]: Opened database '/tmp/fpalgos-astar_single-case0/fpalgos-astar_single-case0_0.db3' for READ_WRITE.
plan success : 5531.43[msec], solution cost : 33.5224
[INFO] [1671047311.723573118] [rosbag2_storage]: Opened database '/tmp/fpalgos-astar_single-case1/fpalgos-astar_single-case1_0.db3' for READ_WRITE.
plan success : 179.765[msec], solution cost : 37.5654
[INFO] [1671047312.022203706] [rosbag2_storage]: Opened database '/tmp/fpalgos-astar_single-case2/fpalgos-astar_single-case2_0.db3' for READ_WRITE.
plan success : 1588.64[msec], solution cost : 42.8882
[INFO] [1671047313.720065870] [rosbag2_storage]: Opened database '/tmp/fpalgos-astar_single-case3/fpalgos-astar_single-case3_0.db3' for READ_WRITE.
[ OK ] AstarSearchTestSuite.SingleCurvature (8651 ms)
[----------] 1 test from AstarSearchTestSuite (8652 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (8652 ms total)
[ PASSED ] 1 test.
=================================================================
==849==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7f2278c341c7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
#1 0x7f2273be6d2b (<unknown module>)
#2 0x7f227959147d (/lib64/ld-linux-x86-64.so.2+0x647d)
Indirect leak of 213 byte(s) in 4 object(s) allocated from:
#0 0x7f2278c341c7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
#1 0x7f2277614e6e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/lib/x86_64-linux-gnu/libstdc++.so.6+0x14be6e)
Indirect leak of 152 byte(s) in 1 object(s) allocated from:
#0 0x7f2278c341c7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
#1 0x7f2278a3f47d in class_loader::impl::AbstractMetaObjectBase::AbstractMetaObjectBase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/opt/ros/humble/lib/libclass_loader.so+0x947d)
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x7f2278c341c7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
#1 0x7f2278a3d45a (/opt/ros/humble/lib/libclass_loader.so+0x745a)
SUMMARY: AddressSanitizer: 389 byte(s) leaked in 7 allocation(s).
class_loader is used in rosbag2, so removing rosbag2 would probably fix the issue.
h-ishida@03238e19ce66:~/autoware/src/universe/autoware.universe/planning/freespace_planning_algorithms/build$ ldd /opt/ros/humble/lib/librosbag2_storage.so
linux-vdso.so.1 (0x00007f2265015000)
libyaml-cpp.so.0.7 => /lib/x86_64-linux-gnu/libyaml-cpp.so.0.7 (0x00007f2264f40000)
libament_index_cpp.so => /opt/ros/humble/lib/libament_index_cpp.so (0x00007f2264f35000)
libclass_loader.so => /opt/ros/humble/lib/libclass_loader.so (0x00007f2264f22000)
librcpputils.so => /opt/ros/humble/lib/librcpputils.so (0x00007f2264f12000)
librcutils.so => /opt/ros/humble/lib/librcutils.so (0x00007f2264efa000)
libconsole_bridge.so.1.0 => /lib/x86_64-linux-gnu/libconsole_bridge.so.1.0 (0x00007f2264ef4000)
libtinyxml2.so.9 => /lib/x86_64-linux-gnu/libtinyxml2.so.9 (0x00007f2264edc000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f2264cb2000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f2264c92000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2264a68000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2265017000)
@kosuke55 In summary, I think one of the following two PRs will fix the issue.
If the error is related to rosbag, then this PR fixes it: https://github.com/autowarefoundation/autoware.universe/pull/2504
If the error is caused by planning failing due to a timeout, the following PR fixes it: https://github.com/autowarefoundation/autoware.universe/pull/2505 However, it is almost impossible for Astar to take more than 10 seconds to solve a problem, so this PR probably will not fix the issue. That said, if the tests in CI are run with multiprocessing or something like that, Astar may not get enough CPU time and could take much longer than normal. Do you know whether the rostests in CI run in a single process?
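To make the timeout hypothesis concrete, here is a minimal sketch of a search loop with a wall-clock time limit. `planWithTimeLimit`, `PlanResult`, and the `expand_once` callback are hypothetical names for illustration and do not reflect how the Astar planner in freespace_planning_algorithms is actually written. The point is only that the deadline is measured in wall-clock time, so a process that gets less CPU does less work before the deadline passes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

enum class PlanResult { Success, Timeout, Exhausted };

// Hypothetical search loop (not the actual planner code).
// `expand_once` performs one unit of search work and returns true once the
// goal has been reached.
PlanResult planWithTimeLimit(
  const std::function<bool()> & expand_once,
  std::chrono::milliseconds time_limit,
  std::size_t max_iterations)
{
  assert(static_cast<bool>(expand_once));

  using Clock = std::chrono::steady_clock;
  const auto deadline = Clock::now() + time_limit;

  for (std::size_t i = 0; i < max_iterations; ++i) {
    // The check uses elapsed wall-clock time, not CPU time: if the process
    // is descheduled, the deadline still approaches while no work is done.
    if (Clock::now() >= deadline) {
      return PlanResult::Timeout;
    }
    if (expand_once()) {
      return PlanResult::Success;
    }
  }
  // The iteration budget ran out before the goal or the deadline was reached.
  return PlanResult::Exhausted;
}
```

With a 10-second limit like the one discussed above, a search that normally finishes in a few seconds would return `PlanResult::Success` on an idle machine. On a heavily loaded CI runner the same search could return `PlanResult::Timeout`, even though the algorithm and inputs are unchanged.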
Checklist
Description
malloc(): invalid size (unsorted)
error occurs in AstarSearchTestSuite.SingleCurvature
Expected behavior
all tests pass
Actual behavior
tests fail
Steps to reproduce
run tests
Versions
No response
Possible causes
No response
Additional context
No response