Closed liuzf1988 closed 1 year ago
I'm trying to reproduce your issue, but the build seems stuck at imu_corrector. How long should I wait?
@haixuanTao Thanks. Has the compilation passed? In my experience, there are two possibilities. One is that the first compilation really does take some time, about 20 minutes. The other is that cargo prints "Blocking waiting for file lock on package cache"; in that case, you only need to temporarily disable the rust-analyzer plug-in and recompile the code. From the figure, it seems to be the first possibility.
It has been running for hours. I cancelled the build and started it again, but it does not seem to make any progress... Mind sharing what environment you are running this dataflow on? Linux, x86, CPU...
cd dora/examples/autoware-dataflow/localization
source /opt/ros/galactic/setup.bash
cargo run --example localization-dataflow
The build.rs file of each Dora operator in "dora/examples/autoware-dataflow/localization/src/operators" contains code like println!("cargo:rerun-if-changed=src/imu_corrector.cpp"); which triggers recompilation when the tracked files (e.g. "imu_corrector.cpp") change. However, the "build", "log" and "install" folders generated by the ROS2 compilation are sometimes deleted manually by the user during debugging. In that case, println!("cargo:rerun-if-changed=src/xxx.xxx"); in build.rs does not trigger recompilation, and the ROS2-side code needs to be recompiled explicitly:
cd dora/examples/autoware-dataflow/localization
source /opt/ros/galactic/setup.bash
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release # optional
cargo run --example localization-dataflow
cd dora/examples/autoware-dataflow/localization
source install/setup.bash
../../../target/debug/dora-coordinator --run-dataflow dataflow.yml --runtime ../../../target/debug/dora-runtime
After discussion, we found the issue: the buffer used to pass data out of one node was not sized appropriately, which made the computation in the following node much more costly because of the larger input.
The issue was not directly linked to Dora, but higher-level types could potentially have helped avoid it.
@haixuanTao Thanks for your great help. The error has now been corrected. The issue was caused by my incorrect understanding of the clear() function of "stringstream". Taking "pointcloud_downsample.cpp" as an example, ss.clear() was meant to clear the buffer of ss, but clear() only resets the stream's error state flags and leaves the buffer untouched. The correct usage is ss.str(""), or defining a new "stringstream" object (e.g., ss2) to construct the output of Dora nodes or operators. I was misled by the name of clear().
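To illustrate the difference, here is a minimal sketch. It is not the code from "pointcloud_downsample.cpp"; the two helper functions and the frame strings are hypothetical and only show how a reused stream keeps growing when it is reset with clear() alone.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: reuses `ss` for each frame and "resets" it with clear() only.
// clear() resets the error state flags; the old buffer content stays in place,
// so every call appends to what was written before.
std::size_t payload_size_with_clear(std::stringstream& ss, const std::string& frame) {
    ss.clear();
    ss << frame;
    return ss.str().size();
}

// Hypothetical helper: empties the buffer with str("") before writing.
// clear() is still called so that a previously failed stream becomes usable again.
std::size_t payload_size_with_str_reset(std::stringstream& ss, const std::string& frame) {
    ss.str("");
    ss.clear();
    ss << frame;
    return ss.str().size();
}
```

With the first helper, the payload grows by one frame on every call, which matches the symptom discussed above: the downstream node receives a buffer that is larger than intended. With the second helper, the payload size stays equal to the size of the current frame.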
After porting the NDT algorithm from the ROS2 ecosystem to the DORA framework, I conducted some tests on its performance. The results show that NDT execution is very time-consuming. I have been trying to find the reason these past few days, but have not been able to solve the problem so far. The following is the detailed debugging process.
NDT operator execution time on Dora side
First, add necessary print info in ndt_scan_matcher.cpp to obtain the time required for NDT execution.
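As an illustration of this kind of instrumentation, here is a minimal sketch based on std::chrono. It is not the code that was added to ndt_scan_matcher.cpp; the helper name time_ms is hypothetical.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `fn` once and returns the elapsed wall-clock time
// in milliseconds, measured with a monotonic clock.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Possible usage inside the operator (assumes the surrounding variables exist):
//   double ms = time_ms([&] { ndt_ptr->align(*output_cloud, initial_pose_matrix); });
//   std::cout << "NDT align took " << ms << " ms" << std::endl;
```

steady_clock is used here rather than system_clock so that the measurement is not affected by adjustments to the system time.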
The call ndt_ptr->align(*output_cloud, initial_pose_matrix); invokes a multi-core version of the NDT algorithm. This multi-core implementation uses the OpenMP-based multi-threading mechanism. The corresponding parallel settings are in ndt_omp_impl.hpp, extracted as follows.
The test results are shown in the figures below. The NDT execution time is usually above 400 ms, which is far from meeting the real-time requirements of localization. The htop and pstree commands indicate that the multi-threading mechanism works properly.
To find out which code is time-consuming, I added more print info to "ndt_omp_impl.hpp"; see the lines commented with "//" in the code above for details. The result shows that the "computeDerivatives" function takes about 70 ms to execute, and this function is called many times on each run of the NDT algorithm. Below is the corresponding screenshot.
NDT execution time on ROS2 side
For comparison, the screenshots of the NDT execution time on the ROS2 side are as follows. The "computeDerivatives" function takes only about 0.7 ms to execute, and the total NDT execution time is about 20 ms.
NDT node execution time on Dora side
Considering that the Dora operator itself runs as a thread, and in order to rule out the influence of this factor on the multi-threading mechanism of NDT, I rewrote "ndt_scan_matcher" as a Dora node and recorded its execution time. The relevant results are as follows:
The execution time remains similar to the case of porting "ndt_scan_matcher" as a Dora operator.
I carefully analyzed the code in ndt_omp_impl.hpp, and there seems to be nothing special about it. So it is strange that the same code takes only about 20 ms to execute on the ROS2 side, but about 1000 ms on the Dora side. Looking forward to your advice.