Interesting... First, I would check whether it's actually related to the architecture at all (arm64 vs amd64). Have you tried it on a regular desktop computer or laptop, and does it work nearly 100% of the time there with those same parameters?
It would be helpful if you could share (dropbox/google drive/...) a ZIP with everything needed to reproduce: launch and config files + rosbag + launch instructions.
My feeling is that it's all related to tuning the uncertainty parameters of odometry. If failures are always near a curve, odometry normally is bad at those points, and we need either a larger uncertainty for rotations in the motion model, or a larger number of particles.
Other direct experiments you can try are:
KLD_minSampleSize=150
==> try larger values for the minimum number of particles, e.g. 300, 400, 500.
LF_decimation=20
==> try smaller values, e.g. 15, 10.
Most likely you will fix it with the number of particles, if it runs OK 90% of the time. If it always fails on curves, then updating the uncertainty parameters will probably be required.
Hi Jose,
Thanks so much for your note. This is super helpful.
First things first, yes, this was happening only on the arm64 platform (a Jetson Xavier NX). For a tally of my own runs:
Clearly, the dataset itself was a bit biased, and I had assumed that the law of large numbers would suffice to draw reasonable statistics. So I decided to increase the x86 run count and lo and behold, I got two fails in the first six runs -- enough to eliminate my incorrect claim regarding differences between the arm64/amd64 platforms. [Please let me know if you would like me to change the title of the issue for future observers of this repo]
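As a rough check on how easily a small tally can mislead: assuming runs are independent and fail with a fixed probability (the "roughly 10%" figure from the original report is used here purely as an illustrative value), the chance of seeing no failures in n runs is (1 - p)^n. A minimal sketch, with function names that are hypothetical:

```python
import math


def prob_no_failures(p_fail: float, n_runs: int) -> float:
    """Probability of zero failures in n_runs independent runs."""
    return (1.0 - p_fail) ** n_runs


def runs_needed(p_fail: float, confidence: float) -> int:
    """Smallest n for which at least one failure appears with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    # Assumed failure rate of 10%; replace with your own estimate.
    for n in (6, 10, 30):
        print(n, round(prob_no_failures(0.10, n), 3))
    print("runs for 95% confidence:", runs_needed(0.10, 0.95))
```

Under these assumptions, ten clean runs in a row still happen about 35% of the time by chance, and it takes about 29 runs before a clean tally is reasonably convincing.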
So I ran a tiny DOE to look for sensitivities:
DOE 1 (10 replicates each): Change KLD_minSampleSize
DOE 2 (10 replicates each): Change LF_Decimation
Trial 3 (30 replicates each): take the "best" of DOEs 1 and 2.
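For anyone who wants to script this rather than run it by hand, here is a minimal sketch of enumerating a full-factorial run plan over the two parameters. The levels are the ones suggested earlier in this thread, not necessarily the exact ones used in the DOEs above, and the sketch only builds the plan. How each run is launched (rosbag playback, launch file arguments) is left out because it depends on the local setup.

```python
from itertools import product

# Levels suggested in this thread; adjust to taste.
KLD_MIN_SAMPLE_SIZES = [150, 300, 400]
LF_DECIMATIONS = [20, 15, 10]


def full_factorial(replicates):
    """Return one dict per run, covering every combination of both parameters."""
    plan = []
    for kld, lf in product(KLD_MIN_SAMPLE_SIZES, LF_DECIMATIONS):
        for rep in range(replicates):
            plan.append(
                {"KLD_minSampleSize": kld, "LF_decimation": lf, "replicate": rep}
            )
    return plan


if __name__ == "__main__":
    plan = full_factorial(replicates=10)
    print(len(plan), "runs")
    print(plan[0])
```

A full factorial grows quickly (3 x 3 levels x 10 replicates is already 90 runs), which is one reason to change one parameter at a time first, as done here.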
On the x86, where there was some variation from run to run, a combination of KLD_minSampleSize of 400 and LF_Decimation of 15 appears to be quite repeatable and accurate. However, the sensitivity to KLD_minSampleSize was quite low (between 150, 300 and 400, there was not much variation). I currently don't have a measurable means of comparing the replicates: the judgment is entirely visual. We need to come up with some means of quantifying the performance.
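One possible way to put a number on each replicate, sketched under assumptions: pick one run as the reference (for example, one judged correct visually), resample both trajectories to the same timestamps, and compute position and heading errors against it. The function name and the (x, y, yaw) tuple layout are hypothetical, and units are assumed to be metres and radians.

```python
import math


def trajectory_errors(estimated, reference):
    """Compare two equally sampled trajectories of (x, y, yaw) poses.

    Returns (position RMSE, max position error, max absolute yaw error).
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("trajectories must be non-empty and the same length")
    sq_sum = 0.0
    max_pos = 0.0
    max_yaw = 0.0
    for (xe, ye, te), (xr, yr, tr) in zip(estimated, reference):
        d = math.hypot(xe - xr, ye - yr)
        sq_sum += d * d
        max_pos = max(max_pos, d)
        # Wrap the yaw difference into [-pi, pi] before taking its magnitude.
        dyaw = math.atan2(math.sin(te - tr), math.cos(te - tr))
        max_yaw = max(max_yaw, abs(dyaw))
    return math.sqrt(sq_sum / len(estimated)), max_pos, max_yaw


if __name__ == "__main__":
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    est = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.1)]
    print(trajectory_errors(est, ref))
```

A run could then be flagged as failed when the maximum position error exceeds a chosen threshold, which would replace the visual judgment with a repeatable pass/fail count.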
I repeated these 30 replicates on the Jetson Xavier with KLD_minSampleSize 150, 300 and 400 and LF_Decimation 15 and had zero errors.
One observation: the initial position (controlled by the init_PDF parameters) seems to be perhaps the biggest variable in terms of localization errors when there is no motion at the beginning. Here are some screenshots:
Good:
Not so good:
This error gets corrected within a few seconds of motion.
That said, taking a step back, the errors indeed are primarily when turns are made and highly exacerbated when sudden turns are made (to avoid dynamic obstacles). A factorial approach as I used above is painful at best and clearly, I am running somewhat blind. I'll take a look at your latest set of links in #125. Thank you!
Closing this as this is not an issue any more. Thanks for your help, @jlblancoc !
I am evaluating the mrpt_localization package and have run into a strange situation with repeatability. Every so often, mrpt_localization produces a disturbingly different localization result than at other times. Of course, the "different" result is wrong and occurs when we are running live in our test environment (Murphy still seems to have his way), hence the issue. So we built an offline setup where we can see the impact of parameter changes, repeated runs and such. This appears to happen only on arm64 and not on amd64.
Environment:
mrpt_localization: 1.0.4 from source and also 1.0.3 from ros-distro
I'll be happy to share the map pgm, mrpt_localization parameters and such with anyone interested:
Some screenshots from various runs are included below. Legend:
Correct runs:
Every now and then (roughly 10% of the time), incorrect runs:
How would one go about figuring out where to look for a root cause? The only (arguably unrelated) issue I can find that talks about arm64 and amd64 differences is this one.
Here is the mrpt_config.ini file, and the launch file calls out these parameter values:
If it helps any, if we use the default_noise_phi value of 2.0 (which is the default), the localization is always incorrect -- this is how we had captured the original bag file. If we set it to 0.5, localization is clearly better, but runs into this repeatability issue.
@maxbader, we could use some of your guidance here.