JuliaRobotics / IncrementalInference.jl

Clique recycling non-Gaussian (multi-modal) factor graph solver; also see Caesar.jl.
MIT License

Pathological priors example (i.e. highly over-confident) shows numerical parasitics #714

Open Affie opened 4 years ago

Affie commented 4 years ago

On the master branches of DFG (v0.8) and IIF.

Here are 2 basic examples that do not work correctly.

using IncrementalInference, RoMEPlotting

fg = initfg()
x = addVariable!(fg, :x0, ContinuousScalar)
p1 = addFactor!(fg, [:x0], Prior(Normal(30,3)))
p2 = addFactor!(fg, [:x0], Prior(Normal(-30,3)))
solveTree!(fg)
pkde = plotKDE(fg, [:x0])

[image: plotKDE results for :x0 over repeated runs, y1-y5]

y1 is what I would expect; the rest (y2-y5) are results from rerunning the example 4 times.

fg = initfg()
x = addVariable!(fg, :x0, ContinuousScalar)
Prior0 = MixturePrior([Normal(-30.0,3); Normal(30,3)], Categorical([0.5;0.5]))
p0 = addFactor!(fg, [:x0], Prior0)
p1 = addFactor!(fg, [:x0], Prior(Normal(30,3)))
p2 = addFactor!(fg, [:x0], Prior(Normal(-30,3)))
solveTree!(fg)
pkde = plotKDE(fg, [:x0])
dehann commented 4 years ago

Hi, thanks! I will take a look.

Affie commented 4 years ago

Maybe this one also helps: [image]

dehann commented 4 years ago

Oh, sorry, I'm not following what's happening in this second image -- where are all the y values coming from?

dehann commented 4 years ago

Also, please add the preamble you are using to generate the plots -- are these done with RoMEPlotting.plotKDE and Plots.jl? I assume you have local changes to combine Plots.jl and RoMEPlotting.jl, so it's a bit hard to see where the second plot with all the y traces is coming from.

Affie commented 4 years ago

The second plot is the same as the first, just repeated a bunch of times in a loop. It’s just meant to show the range of the result.

I struggle with RoMEPlotting to actually display and combine plots, so I used it to generate the plots and then combined them with Plots.jl manually. So it's not the output of the script run once, but repeatedly.

dehann commented 4 years ago

Hi @Affie, so this case is not quite as straightforward as "correct" or "incorrect". What you are expecting as a parametric result in the middle should be viewed as a histogram (or summation) of all the cases shown in your second plot. Statistically, the combined event could be anywhere between the two narrow priors.

So a "bug" would be more if the product belief lies outside the two priors, or something like that. On the one hand we would like a pure settled Gaussian mode precisely between the two prior values (as a quick parametric calculation would give). The parasitics of the numerical methods used introduce the variation.
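For reference, the quick parametric calculation mentioned above has a closed form: the product of two Gaussian densities is (up to scale) another Gaussian with a precision-weighted mean. A minimal package-free sketch (the function name is my own, not IIF API):

```julia
# Product of two Gaussian densities N(mu1, s1^2) * N(mu2, s2^2):
# the result is proportional to another Gaussian (precision-weighted).
function gaussian_product(mu1, s1, mu2, s2)
    w1, w2 = 1 / s1^2, 1 / s2^2             # precisions
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)  # precision-weighted mean
    s = sqrt(1 / (w1 + w2))                 # combined std deviation
    return mu, s
end

mu, s = gaussian_product(30.0, 3.0, -30.0, 3.0)
# for the two equal-width priors at +-30 the mode lands exactly in
# between, with spread 3/sqrt(2)
```

With equal standard deviations the result is always the midpoint, which is the settled Gaussian mode the parametric view predicts.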

On the positive side, by forcing two priors so narrow and far apart, you are effectively saying that anything between them is fair game (think of it like a poorly conditioned matrix, or something analogous).

Performance should be much more repeatable if the two measurements (priors) are not taken to be so confident (narrow). Are you sure that at least one of the two priors is not overconfident? See example below.

Also, the second example with the added mixture prior is a very different animal. The mixture prior is a summation of two modes, while the first example is a product of two Gaussian priors.
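The sum-vs-product distinction can be checked directly with plain density functions (illustrative only, written from scratch here, no IIF involved):

```julia
# plain Gaussian density, no packages
normpdf(x, mu, s) = exp(-(x - mu)^2 / (2 * s^2)) / (s * sqrt(2 * pi))

# mixture prior: a *sum* of modes -- the belief stays bimodal
mix(x) = 0.5 * normpdf(x, -30, 3) + 0.5 * normpdf(x, 30, 3)

# product of the two unimodal priors -- unimodal, peaked in between
prodp(x) = normpdf(x, -30, 3) * normpdf(x, 30, 3)

mix(30.0) > mix(0.0)     # the mixture keeps its modes at +-30
prodp(0.0) > prodp(30.0) # the product peaks at 0, between the priors
```

So the mixture keeps both hypotheses alive, while the product of two opposing priors concentrates mass between them.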

Stochastic Example:

using IncrementalInference, RoMEPlotting

fg = initfg()
addVariable!(fg, :x0, ContinuousScalar)
addFactor!(fg, [:x0], Prior(Normal(30,20)))
addFactor!(fg, [:x0], Prior(Normal(-30,20)))

ARR = BallTreeDensity[]
[(solveTree!(fg); push!(ARR,getKDE(fg, :x0))) for i in 1:30]
plotKDE(ARR) # plot3
sms = hcat((rand.(ARR, 1000))...)
Gadfly.plot(x=sms, Geom.hist) # plot4

[Screenshot: plot3 -- plotKDE of the 30 repeated solves]

[Screenshot: plot4 -- histogram of the aggregated samples]
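The run-to-run spread of the 30 solves above can be mimicked without IIF by a small self-normalized importance-sampling sketch of the prior product (the sampler and names here are my own illustration, not the solver's actual algorithm):

```julia
using Random

normpdf(x, mu, s) = exp(-(x - mu)^2 / (2 * s^2)) / (s * sqrt(2 * pi))

# one stochastic "product" of the two priors: draw from a broad uniform
# proposal, weight by both prior densities, then resample by weight
function product_samples(rng, mu1, s1, mu2, s2; N = 500)
    xs = 100 .* (rand(rng, N) .- 0.5)                  # Uniform(-50, 50)
    w = normpdf.(xs, mu1, s1) .* normpdf.(xs, mu2, s2)
    w ./= sum(w)
    c = cumsum(w)
    [xs[min(searchsortedfirst(c, rand(rng)), N)] for _ in 1:N]
end

rng = MersenneTwister(42)
means = [sum(product_samples(rng, 30.0, 20.0, -30.0, 20.0)) / 500 for _ in 1:30]
# the 30 per-run means scatter around 0, analogous to the repeated
# solveTree! results with the wide (20) priors
```

Each repetition gives a slightly different empirical product, which is the same kind of stochastic variation the repeated plotKDE traces show.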

Parametric

Analytical product is pretty close (after scaling) to the aggregate samples from the 30 products above:

pr(x) = exp((-x^2-900)/400)/800/pi   # WolframAlpha

using Gadfly, Colors

Gadfly.plot(
        Gadfly.layer(x=-50:0.1:50, y=620*pr.(-50:0.1:50), Geom.line, Theme(default_color=colorant"red")),
        Gadfly.layer(x=sms, Geom.histogram(density=true)) 
)
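As a quick sanity check on that WolframAlpha expression (my own check, not part of the original plots): pr(x) should have the same shape as a Gaussian with mean 0 and variance 200 (the parametric product of the two wide priors), differing only by a constant scale factor.

```julia
pr(x) = exp((-x^2 - 900) / 400) / 800 / pi   # unnormalized product, as above
gauss(x) = exp(-x^2 / 400) / sqrt(400 * pi)  # N(0, 200) density

# if the shapes match, the ratio pr/gauss is the same constant everywhere
ratio0 = pr(0.0) / gauss(0.0)
ratio10 = pr(10.0) / gauss(10.0)
```

The constant ratio is why the red analytic curve in the plot only needs the ad-hoc 620x scaling to lie on top of the sample histogram.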

Conclusion

The question is whether the current generation solver should be updated to address this case (where behaviour is less reliable in regions that are statistically highly unlikely, i.e. narrow and opposing priors), or whether your use case should actually allow more uncertainty in at least one of the priors.

On the mixture prior case: this would occur if you have two options in one measurement and you are not sure which case is manifesting, so by adding the two modes together into a mixture prior you allow the freedom for that selection to be made during inference (also see the multihypo interface for related cases).

In general, we would like to improve the parasitic performance of the numerical/stochastic method (to be similar to the parametric counterpart), and my feeling is that it's best to leave that for gen2 -- unless there is a really strong reason to dive in here.

Affie commented 4 years ago

Thanks for the thorough explanation. There are 2 reasons why I looked at it: I wanted to confirm the covariance is calculated correctly for the parametric solves, and I thought it might be the cause of the numerical inaccuracies I'm seeing in larger solutions. It looks like there is good agreement on the examples with the larger variance, so I think parametric works as it should. It's just the tight variance that gives a different result. Sorry for the unclear images previously; your histograms give a much clearer picture.

Just for clarity, here are the histograms based on your code examples but with the tight (3) variance: [image]

dehann commented 4 years ago

I'd have to look more closely, but in the meantime:

Is there any parameter to tune to get less numerical parasitics, even if it comes at a cost of more computation time?

try getSolverParams(fg).N = 300 or something like that.
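The reason raising N should help can be seen in miniature with a plain Monte Carlo mean estimate, whose run-to-run spread shrinks roughly as 1/sqrt(N). This sketch is a generic illustration of that scaling, not the IIF sampler itself:

```julia
using Random

# spread (std dev) of a Monte Carlo mean estimate across repeated runs
function mean_spread(rng, N; runs = 200)
    ests = [sum(randn(rng, N)) / N for _ in 1:runs]
    m = sum(ests) / runs
    sqrt(sum((e - m)^2 for e in ests) / runs)
end

rng = MersenneTwister(1)
s100 = mean_spread(rng, 100)   # spread with 100 samples per run
s900 = mean_spread(rng, 900)   # roughly 3x tighter, at 9x the compute
```

The same trade-off applies to the particle count: a larger getSolverParams(fg).N should tighten the repeated-solve spread at the cost of more computation.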

Regarding the second example: is there a way to get 2 confident measurements to produce 2 modes, rather than smear to somewhere in between, without the multihypo factor?

If I understand you correctly, then using only the mixture prior you added in the second example should be what you want (without the unimodal priors).

Affie commented 3 years ago

I think I'm still running into (the effects of) this issue with multihypo. It should not happen after #1010, so I'll re-evaluate once that is in.

Affie commented 2 years ago

Bearing-only issues (https://github.com/JuliaRobotics/RoME.jl/issues/344) possibly relate to this issue.