JuliaRobotics / Caesar.jl

Robust robotic localization and mapping, together with NavAbility(TM). Reach out to info@wherewhen.ai for help.
https://www.wherewhen.ai
MIT License

Odometry disconnect at the end of long chains #382

Open mc2922 opened 5 years ago

mc2922 commented 5 years ago

Odometry appears to disconnect towards the end of longer chains?

Range-only factors, 190 Point2: ex2-1.pdf, ex2-2.pdf [image: ex2-2]

Range-only factors, 200 Point2: ex1-1.pdf, ex1-2.pdf [image: ex1-2]

See the example script here, which uses a minimum of 171 Point2 variables: https://github.com/JuliaRobotics/Caesar.jl/blob/dynSAS/examples/marine/asv/kayaks/testRangeOnlyOdo.jl

Bayes Tree: bt.pdf

[image: ex3-2]

dehann commented 5 years ago

Depiction of the Bayes tree that is associated with this issue. The short branch has a prior on 'X', but that prior information is not properly distributed down the long sibling branch.

[image: IMG_1675]

dehann commented 5 years ago

Another anecdotal piece of information: the problem also occurred on a single long branch.

dehann commented 5 years ago

Think this may be related to JuliaRobotics/IncrementalInference.jl#244 which has been postponed for the DFG and CSM upgrades.

dehann commented 5 years ago

From the photo above, the triangle is x169, square is x170. When looking at the clique low down on right hand branch (x169), the following subgraph is obtained. Note priors represent the downward belief messages during the final stages of the complete solve. This is where the disconnect in odo [x169 -- x170] seems to be occurring:

[image: Screenshot from 2019-09-02 12-33-06]

dehann commented 5 years ago

All these loose fragments should also be computed in parallel btw.

dehann commented 5 years ago

Ha, a little more progress. Looking closer, I found that odometry factor x168--x169 is working the wrong way round for some reason: [image: Screenshot from 2019-09-02 23-35-02]

and then found this. The order of fields x168 and x169 is important here, but is clearly wrong: [image: Screenshot from 2019-09-02 23-37-58]

@GearsAD, do you have any ideas here? I think we spoke about this a little while back, but I cannot recall what we decided to do about fncargvID.

Values in the factor seem fine: [image: Screenshot from 2019-09-02 23-35-31]

Oh, and the factor graph name here, sfg31bd, stands for "sub factor graph inside cliq 31 before the down solve is performed": [image: Screenshot from 2019-09-02 23-44-33]
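
To make the variable-order point above concrete, here is a minimal toy sketch in Python. It is not the IncrementalInference.jl implementation and says nothing about how fncargvID is actually stored; the `residual` function, the variable values, and the measurement are all hypothetical. It only shows why a relative (odometry-style) factor gives a different answer when its two variables are listed in the wrong order.

```python
# Toy 2D example (hypothetical values, not taken from the issue's factor graph).
values = {
    "x168": (10.0, 0.0),
    "x169": (11.0, 0.0),
}

# Hypothetical odometry measurement: x169 is expected to be x168 + delta.
delta = (1.0, 0.0)

def residual(order, values, delta):
    """Residual of a relative factor: (second - first) - delta.

    `order` is the (first, second) variable ordering stored with the factor.
    """
    first, second = order
    return tuple(
        values[second][i] - values[first][i] - delta[i] for i in range(2)
    )

# Correct ordering: the measurement is consistent, residual is zero.
res_ok = residual(("x168", "x169"), values, delta)

# Swapped ordering: the same measurement is applied in the opposite direction,
# so the factor now pulls the two variables apart by twice the delta.
res_swapped = residual(("x169", "x168"), values, delta)

print(res_ok)       # (0.0, 0.0)
print(res_swapped)  # (-2.0, 0.0)
```

With the stored values unchanged, only the ordering differs between the two calls, which matches the observation that the values in the factor look fine while the factor still behaves the wrong way round.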

dehann commented 5 years ago

this does not seem to be present in the original factor graph: [image: Screenshot from 2019-09-02 23-51-44]

dehann commented 5 years ago

right, so part of the problem was that downward results were not being transferred from cliq sub graphs back to main dfg -- this was an oversight in the new CSM code. However, the issue is not yet fully resolved since downward messages from root into the target clique might have inconsistencies, and I'm looking specifically at that now.
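
As a rough illustration of the first part of that problem, the toy Python sketch below models graphs as plain dictionaries. It is an assumption-laden stand-in, not the DFG or CSM code: `build_subgraph`, `down_solve`, and `transfer_back` are hypothetical names. The point is only that a solve performed on a copied clique subgraph leaves the main graph untouched unless the results are explicitly written back.

```python
import copy

def build_subgraph(main_graph, clique_vars):
    """Copy the clique's variables out of the main graph (hypothetical helper)."""
    return {v: copy.deepcopy(main_graph[v]) for v in clique_vars}

def down_solve(subgraph, update):
    """Stand-in for a downward solve: apply `update` to every estimate."""
    for name in subgraph:
        subgraph[name]["estimate"] = update(subgraph[name]["estimate"])

def transfer_back(main_graph, subgraph):
    """Write the solved estimates from the subgraph back into the main graph."""
    for name, data in subgraph.items():
        main_graph[name]["estimate"] = data["estimate"]

# Hypothetical main graph with two variables from the clique in question.
main = {
    "x169": {"estimate": 0.0},
    "x170": {"estimate": 0.0},
}

sub = build_subgraph(main, ["x169", "x170"])
down_solve(sub, lambda e: e + 5.0)

# Without the transfer step the main graph still holds the stale estimates.
stale = main["x169"]["estimate"]

transfer_back(main, sub)
fresh = main["x169"]["estimate"]

print(stale, fresh)  # 0.0 5.0
```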

dehann commented 5 years ago

performance should be better with IIF v0.7.7 or greater. Keeping this open for rigorous testing.

dehann commented 4 years ago

need to retest -- currently on IIF v0.8.3 with many improvements since this issue was logged.

dehann commented 4 years ago

Will re-evaluate after JuliaRobotics/IncrementalInference.jl#579 is fixed