MarineRoboticsGroup / cora

This is the official repository for the paper "Certifiably Correct Range-Aided SLAM"

Results empirically observed to degenerate in low-noise regimes #27

Closed · adthoms closed this issue 3 months ago

adthoms commented 3 months ago

@alanpapalia I am running CORA on a simulated experiment with highly accurate wireless sensing and odometry. For context, the simulated experiment is illustrated in the figure below, where a robot (blue) completes a single square loop while ranging to a constellation of landmarks (red). Range measurements exist between each robot pose and all landmarks:

[Figure: simulated_experiment]
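For readers without the figure, here is a minimal NumPy sketch of this kind of setup; the side length, landmark positions, and noise level are illustrative assumptions, not the values in the .pyfg file:

# Illustrative reconstruction of the simulated setup described above.
# The side length, landmark layout, and sigma are assumptions, not
# values taken from factor_graph_low_noise.pyfg.
import numpy as np

rng = np.random.default_rng(0)

# Robot positions: a single square loop, 10 poses per side.
side = np.linspace(0.0, 10.0, 10, endpoint=False)
loop = np.concatenate([
    np.stack([side, np.zeros_like(side)], axis=1),               # bottom edge
    np.stack([np.full_like(side, 10.0), side], axis=1),          # right edge
    np.stack([10.0 - side, np.full_like(side, 10.0)], axis=1),   # top edge
    np.stack([np.zeros_like(side), 10.0 - side], axis=1),        # left edge
])

# Constellation of landmarks.
landmarks = np.array([[2.0, 3.0], [7.0, 2.0], [8.0, 8.0], [3.0, 7.0]])

# Every pose ranges to every landmark, with Gaussian noise of std sigma.
sigma = 1e-4
dists = np.linalg.norm(loop[:, None, :] - landmarks[None, :, :], axis=2)
ranges = dists + sigma * rng.standard_normal(dists.shape)

print(ranges.shape)  # (num_poses, num_landmarks) = (40, 4)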

When executing CORA on the following factor_graph_low_noise.pyfg file, the output robot trajectory with respect to ground truth is:

[Figure: cora_results_low_noise]

After adjusting the noise values of the range and odometry measurements from 0.0001 to 0.001, the output robot trajectory with respect to ground truth is:

[Figure: cora_results_higher_noise]

Any insights on why this would be the case would be greatly appreciated.

alanpapalia commented 3 months ago

Thanks for the detailed writeup - it was quite helpful in understanding what you're seeing! It's reassuring that slightly higher noise seems to work okay. That tells me it's probably something due to numerical precision.

My guess is that either the solver is terminating early (before obtaining a certified solution) or the certification threshold (eta) needs to be tightened.

I would start by checking if the solver is obtaining a certifiably optimal solution (if so, what is the reported suboptimality gap?). I can write up how to check this if it's not immediately obvious how to find this out, so let me know if that'd be useful.

If it is not obtaining a certifiably optimal solution, you could tighten the solver parameters or increase the maximum allowed solution rank.
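For illustration, the kind of knobs I mean would look something like this (the names below are hypothetical, not CORA's actual API; the real settings live in the solver configuration in this repo):

# Hypothetical illustration of "tighten the solver / raise the max rank".
# These names do NOT correspond to CORA's actual API; they only sketch the
# kind of settings typically exposed by a Riemannian staircase solver.
solver_options = {
    "grad_norm_tol": 1e-12,          # tighter first-order stopping tolerance
    "rel_func_decrease_tol": 1e-12,  # stop only when progress truly stalls
    "max_iterations": 10_000,        # allow many more inner iterations
}
staircase_options = {
    "max_rank": 20,             # let the relaxation climb to higher ranks
    "certification_eta": 1e-3,  # tolerance used in the certificate check
}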

If none of this seems to go anywhere let me know. I'll try running the problem (thanks again for providing it) and see what is going on.

adthoms commented 3 months ago

@alanpapalia Thank you for getting back to me. Here is the console log output:

Solving data/factor_graph.pyfg
WARNING: Logging iterates will slow down the optimization process.  This is intended for debugging and viz purposes only.

Solving problem at rank 4
Obtained solution with objective value: 466504463.795491
Result is certified: 0 with eta: 0.100000 and theta: -11700.793684

Solving problem at rank 5
Obtained solution with objective value: 466488712.190711
Result is certified: 0 with eta: 0.100000 and theta: -22001.141599

Solving problem at rank 6
Obtained solution with objective value: 466472961.974869
Result is certified: 0 with eta: 0.100000 and theta: -49814.221508

Solving problem at rank 7
Obtained solution with objective value: 466457213.147942
Result is certified: 0 with eta: 0.100000 and theta: -107002.636562

Solving problem at rank 8
Obtained solution with objective value: 466441465.709907
Result is certified: 0 with eta: 0.100000 and theta: -129397.302353

Solving problem at rank 9
Obtained solution with objective value: 466425719.660750
Result is certified: 0 with eta: 0.100000 and theta: -145326.119495

Solving problem at rank 10
Obtained solution with objective value: 466409975.000451
Result is certified: 0 with eta: 0.100000 and theta: -154569.260953

Projecting solution to rank 3 and refining.
Out of 200 blocks, 116 have positive determinant. This is 58.000000% of the total.
Obtained solution with objective value: 454269200.586797
Final solution is certified: 0 with eta: 0.100000 and theta: -154165.942442
CORA took 0.139278 seconds

A certifiably optimal solution is not obtained (i.e. Final solution is certified: 0). I have played around with tightening the solver parameters (with no success) and the solution rank does not exceed CORA's maximum.

Edit: Note that when solving the factor graph with higher noise values (as indicated in my earlier post), a certifiably optimal solution is also not obtained (though visually it appeared to be the case). Here is the console output when the noise values are increased:

Solving data/factor_graph.pyfg
WARNING: Logging iterates will slow down the optimization process.  This is intended for debugging and viz purposes only.

Solving problem at rank 4
Obtained solution with objective value: 4.665970
Result is certified: 0 with eta: 0.000000 and theta: -0.000746

Solving problem at rank 5
Obtained solution with objective value: 4.665599
Result is certified: 0 with eta: 0.000000 and theta: -0.000222

Solving problem at rank 6
Obtained solution with objective value: 4.665465
Result is certified: 0 with eta: 0.000000 and theta: -0.000232

Solving problem at rank 7
Obtained solution with objective value: 4.665336
Result is certified: 0 with eta: 0.000000 and theta: -0.000516

Solving problem at rank 8
Obtained solution with objective value: 4.665211
Result is certified: 0 with eta: 0.000000 and theta: -0.000687

Solving problem at rank 9
Obtained solution with objective value: 4.665088
Result is certified: 0 with eta: 0.000000 and theta: -0.000409

Solving problem at rank 10
Obtained solution with objective value: 4.664965
Result is certified: 0 with eta: 0.000000 and theta: -0.001268

Edit: When running single_drone.pyfg with the range noise value changed from 0.09 to 0.0001, CORA terminates with the following console log:

Solving problem at rank 10
Obtained solution with objective value: 397328518.165995
Result is certified: 0 with eta: 0.397329 and theta: -425250.667631

Projecting solution to rank 3 and refining.
Out of 1754 blocks, 862 have positive determinant. This is 49.144812% of the total.
Obtained solution with objective value: 393836145.138260
Final solution is certified: 0 with eta: 0.393836 and theta: -425886.918593
CORA took 0.374625 seconds

which leads me to believe noise levels remain a factor in CORA's solver.

alanpapalia commented 3 months ago

Yeah, that's definitely what's going on - the theta parameter being printed out is effectively the "certification gap". If theta is exactly zero, the solution is theoretically guaranteed to be optimal, but due to numerical (im)precision that's unlikely to happen.

Instead, we have a small tolerance parameter eta that serves as our optimality tolerance. Typically we'd want to see -1e-2 < theta <= 0 to feel good about optimality, but this can depend on various properties of the problem (e.g., the norm of the data matrix, the problem dimension, etc.).
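For context, a rough sketch of what this check amounts to, assuming theta is the minimum eigenvalue of the certificate matrix (as in SE-Sync-style certifiers; this is an assumption about CORA's convention, not a quote of its code):

# Sketch of an SE-Sync-style certificate check.
# Assumption: theta is the minimum eigenvalue of the symmetric certificate
# matrix S, and the solution is accepted when theta is no more negative
# than -eta.
import numpy as np

def is_certified(S, eta):
    """Return (certified, theta) for a symmetric certificate matrix S."""
    theta = float(np.linalg.eigvalsh(S).min())  # most negative eigenvalue of S
    return theta >= -eta, theta

# At a true global optimum, S is positive semidefinite, so theta == 0 up to
# roundoff; the large negative thetas in the logs above mean the check
# clearly fails.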

Maybe cranking up the ranging precision is making the data matrix substantially ill-conditioned and that's leading to issues? I can take a look at this sometime this coming week.
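To see why that's plausible: least-squares weights scale as 1/sigma^2, so sigma = 1e-4 already contributes weights of 1e8, and mixing those with order-one terms stretches the spectrum of the data matrix. A toy example (generic weighted least squares, not CORA's actual data matrix):

# Toy illustration of how tiny measurement sigmas inflate conditioning.
# This is a generic weighted least-squares example, not CORA's data matrix.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 10))

for sigma in (1e-1, 1e-2, 1e-3, 1e-4):
    w = 1.0 / sigma**2                   # information weight of one block
    W = np.diag([w] * 50 + [1.0] * 50)   # half the rows use the tiny sigma
    H = A.T @ W @ A                      # normal-equations data matrix
    print(f"sigma={sigma:.0e}  weight={w:.0e}  cond(H)={np.linalg.cond(H):.2e}")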

adthoms commented 3 months ago

@alanpapalia Thank you for the help. I will see if these issues persist in the (now deprecated) MATLAB implementation of CORA.

alanpapalia commented 3 months ago

Sounds good! Probably the first things I'd try to change to improve results would be the solver parameters and the maximum allowed rank mentioned above.

Not at my laptop right now, otherwise I would point to the relevant lines of code.

alanpapalia commented 3 months ago

Try checking out this new branch and see if it fixes your issues. You can look at the diff to see what I did. The most impactful change appeared to be tightening the solver params so that the solver takes more iterations.

This isn't terribly surprising because the problem is likely very ill-posed due to the range precisions being so high.

adthoms commented 3 months ago

@alanpapalia Re-running factor_graph_low_noise.pyfg with the new branch yields the following (note that max_rank = 50). Notice how theta falls outside of -1e-2 < theta <= 0, with the exception of rank = 5:

[Figure: cora_high_precisions_numerical_errors_factor_graph_low_noise_fig]

I do agree that the problem is ill-posed, and for now scaling the covariance will do fine for my purposes. I will close this issue, though the new features added in the new branch look great :)
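For anyone hitting the same issue, the "scale the covariance" workaround mentioned above can be as simple as clamping very small sigmas to a floor before building the factor graph; a generic sketch (the measurements structure and field names are hypothetical placeholders, not part of CORA or PyFactorGraph):

# Generic sketch of the covariance-scaling workaround: clamp very small
# measurement sigmas to a floor before writing out the factor graph.
# The `measurements` list and its field names are hypothetical.
SIGMA_FLOOR = 1e-3  # chosen so that weights 1/sigma^2 stay <= 1e6

def clamp_sigmas(measurements, floor=SIGMA_FLOOR):
    for m in measurements:
        m["range_sigma"] = max(m["range_sigma"], floor)
        m["odom_sigma"] = max(m["odom_sigma"], floor)
    return measurements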