ssmit1986 / BayesianInference

Wolfram Language application for Bayesian inference and Gaussian process regression
MIT License

MAP estimate of parameters (How to?) #7

Closed · spasetto closed this issue 3 years ago

spasetto commented 3 years ago

Hi, I'm trying to give your nice code a go. For example, copying and pasting exactly from your example notebook to generate the data:

BlockRandom[
    SeedRandom[1];
    testdata = (# -> (# + 2 RandomReal[])) & /@ RandomReal[{-1, 1}, 20];
]

the object

obj = defineInferenceProblem[
    "Data" -> testdata,
    "GeneratingDistribution" -> NormalDistribution[a x + b, sigma],
    "Parameters" -> {{a, -10, 10}, {b, -10, 10}, {sigma, 0.1, 10}},
    "IndependentVariables" -> {x},
    "PriorDistribution" -> {
        ProbabilityDistribution[Sqrt[2]/Pi/(1 + x^4), {x, -10, 10}], (* here it is just an example... *)
        UniformDistribution[{-10, 10}],
        "ScaleParameter"
    }
]

and I save everything in result:

result = nestedSampling[obj]

How do I get the maximum a posteriori (MAP) values (rather than the averaged values provided)? I've tried to sum #LogPriorPDF and #LogLikelihood but, for example,

Query[TakeLargestBy[#LogPriorPDF &, 1]]@result["Samples", All]

returns errors ("Failure..." etc.). Thanks!

(BTW: Query[TakeLargestBy[#LogLikelihood &, 1]]@result["Samples", All] works fine if I want the likelihood.)

ssmit1986 commented 3 years ago

Hi,

Thanks for giving it a go. I see where the problem is occurring: the "LogPriorPDF" property isn't getting populated for the initial samples and that's why the query fails. I just pushed an update that fixes that problem. The line:

TakeLargestBy[#LogLikelihood + #LogPriorPDF &, 1] @ result["Samples"]

should work now if you get the latest code from the main branch (it's just a one-line fix).

I think I missed that because I never really looked at MAP estimates with this code, since there are easier ways to get one with NMaximize or FindMaximum. A while ago I added an experimental function laplacePosteriorFit for that kind of thing, so you may want to give that a look. There is some documentation in the section "The Laplace approximation".
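
For example, here's a quick sketch of a direct MAP estimate for the linear toy model from your first post, done entirely outside the package with NMaximize. Note the assumptions: I'm taking the "ScaleParameter" prior to be a Jeffreys 1/sigma prior, and I'm dropping normalisation constants, which don't move the location of the maximum:

logPosterior[a_?NumericQ, b_?NumericQ, sigma_?NumericQ] :=
    Total[ (* loglikelihood, summed over the x -> y data pairs *)
        Log @ PDF[NormalDistribution[a First[#] + b, sigma], Last[#]] & /@ testdata
    ] +
    Log[Sqrt[2]/(Pi (1 + a^4))] + (* prior on a *)
    Log[1/20] +                   (* UniformDistribution[{-10, 10}] prior on b *)
    Log[1/sigma]                  (* assumed Jeffreys prior on sigma, up to normalisation *)

NMaximize[
    {logPosterior[a, b, sigma], -10 <= a <= 10, -10 <= b <= 10, 0.1 <= sigma <= 10},
    {a, b, sigma}
]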

Best regards, Sjoerd

spasetto commented 3 years ago

Thanks, Sjoerd, for the prompt reply. Now it works as expected.

May I take the chance to ask you a couple of questions about problems I've run into today while trying your code?

1) What is the meaning of the error:

checkCompiledFunction::mainEval: CompiledFunction LogLikelihoodFunction has calls to MainEvaluate and may not perform optimally

You should be able to reproduce it with this code, simplified from an optimal-control problem. Let me set a control by hand (say, a time series):

Ts = TimeSeries[
    {{33., 1.}, {75., 1.}, {106., 1.}, {165., 1.}, {194., 1.}, {221., 0.}, {256., 0.}, {279., 0.},
     {312., 0.}, {341., 0.}, {379., 0.}, {428., 1.}, {456., 1.}, {484., 1.}, {526., 1.}, {561., 1.},
     {603., 1.}, {638., 1.}, {666., 1.}, {694., 1.}, {722., 0.}, {750., 0.}, {792., 0.}, {834., 1.},
     {876., 1.}, {914., 1.}, {932., 1.}, {960., 1.}, {1009., 1.}, {1051., 1.}, {1093., 1.}, {1135., 1.},
     {1184., 1.}, {1226., 1.}},
    ResamplingMethod -> {"Interpolation", "InterpolationOrder" -> 0}
];

and an ODE system as the model. For example, something a little juicier than a straight line might be:

Model[p1_?NumberQ, p2_?NumberQ, p3_?NumberQ, p4_?NumberQ, ic1_?NumberQ, ic2_?NumberQ, ic3_?NumberQ] := (
    Model[p1, p2, p3, p4, ic1, ic2, ic3] = Block[{S, DD, P, sol, \[Epsilon] = 10^-16},
        sol = First[NDSolve[
            {
                S'[\[Tau]] == S[\[Tau]]/(S[\[Tau]] + DD[\[Tau]] + \[Epsilon]) p1 Log[2.] S[\[Tau]],
                S[33.] == ic1,
                DD'[\[Tau]] == -p4 DD[\[Tau]] Ts[\[Tau]] +
                    Log[2.] S[\[Tau]] (1. - (p1 S[\[Tau]])/(DD[\[Tau]] + S[\[Tau]] + \[Epsilon])),
                DD[33.] == ic2,
                P'[\[Tau]] == p2 DD[\[Tau]] - p3 P[\[Tau]],
                P[33.] == ic3
            },
            {S, DD, P},
            {\[Tau], 33., 1226.}
        ]];
        P[#] & /. sol
    ]
)

Now I set some data as you indicate in your example notebook; in my case I take:

data = {33. -> 11.83, 75. -> 5.02, 106. -> 4.87, 165. -> 5.1, 194. -> 2.86, 221. -> 2.04,
    256. -> 2.29, 279. -> 3.44, 312. -> 3.47, 341. -> 5.52, 379. -> 6.83, 428. -> 10.62,
    456. -> 21.2, 484. -> 9.75, 526. -> 9.64, 561. -> 8.52, 603. -> 7.61, 638. -> 7.43,
    666. -> 5.95, 694. -> 6.8, 722. -> 5.64, 750. -> 7.06, 792. -> 7.55, 834. -> 12.89,
    876. -> 12.6, 914. -> 11.32, 932. -> 11.45, 960. -> 13., 1009. -> 12.85, 1051. -> 14.32,
    1093. -> 14.31, 1135. -> 12.65, 1184. -> 14.52, 1226. -> 15.78};

The mentioned warning appears when I define the inference problem:

obj = defineInferenceProblem[
    "Data" -> data,
    "GeneratingDistribution" -> NormalDistribution[Model[p1, p2, p3, p4, ic1, ic2, ic3][t], \[Sigma]],
    "Parameters" -> {{p1, 0, 1}, {p2, 0, 1}, {p3, 0, 1}, {p4, 0, 1}, {ic1, 0, 1}, {ic2, 0, 1}, {ic3, 0, 30}, {\[Sigma], 0.01, 10.}},
    "IndependentVariables" -> {t},
    "PriorDistribution" -> {
        "LocationParameter", "LocationParameter", "LocationParameter", "LocationParameter",
        "LocationParameter", "LocationParameter", "LocationParameter", "ScaleParameter"
    }
]

I guess the question is more... "How would you avoid it?"

2) The second question is more conceptual, or Mathematica-related. How would you impose constraints on the parameters, say p1 > p2? While this is quite straightforward in an NMaximize approach, in the framework of your package it is not obvious to me. (Note: I do not mean adding prior constraints such as, e.g., p1 as UniformDistribution[{0, 0.5}] and p2 as UniformDistribution[{0.5, 1}].)

Thank you very much for the insights into your beautiful project.

Best

Stefano

ssmit1986 commented 3 years ago

Hi Stefano,

Glad to hear that that resolved it. As for your new questions:

  1. This is simply a warning message I put in to indicate that the automatic compilation of the loglikelihood and/or logprior functions failed. The code should still work, but the sampling process will be slower because of the lack of compilation. You can always directly define the "LogLikelihoodFunction" and "LogPriorPDFFunction" (or "LogPriorPDF", which expects a symbolic expression of the PDF) in defineInferenceProblem to provide your own compiled functions. With a model that invokes NDSolve, that's going to be tricky, though. You may instead want to look into ParametricNDSolveValue to define your model, since that will allow Mathematica to do some of the symbolic pre-processing in advance rather than having to do it each time Model is invoked (see the first sketch after this list).

  2. I'm not sure how you can, mathematically, impose constraints other than by putting them in the loglikelihood or the prior functions. The whole inference procedure is based on just those two functions, so any constraint will have to be imposed through one of them. You can return $MachineLogZero from the loglikelihood or logprior function when a point doesn't satisfy the constraints, to signal that it's an illegal point (see the second sketch below). You may want to be careful not to mess up the normalisation constant of the prior, though, since that makes the logevidence ill-defined. Again, in this situation it's probably easiest to define your own "LogLikelihoodFunction" property in defineInferenceProblem. There are also built-in functions like TruncatedDistribution that can help you with imposing constraints on distributions.
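
To illustrate the first point, here's a rough sketch (untested against your exact setup) of the ParametricNDSolveValue approach. The symbolic processing of the equations happens once, and the returned ParametricFunction is then cheap to evaluate for each new parameter vector:

pSol = ParametricNDSolveValue[
    {
        S'[\[Tau]] == S[\[Tau]]/(S[\[Tau]] + DD[\[Tau]] + 10^-16) p1 Log[2.] S[\[Tau]],
        S[33.] == ic1,
        DD'[\[Tau]] == -p4 DD[\[Tau]] Ts[\[Tau]] +
            Log[2.] S[\[Tau]] (1. - (p1 S[\[Tau]])/(DD[\[Tau]] + S[\[Tau]] + 10^-16)),
        DD[33.] == ic2,
        P'[\[Tau]] == p2 DD[\[Tau]] - p3 P[\[Tau]],
        P[33.] == ic3
    },
    P, (* return only the interpolating function for P *)
    {\[Tau], 33., 1226.},
    {p1, p2, p3, p4, ic1, ic2, ic3}
];

(* evaluate P at a given time for a given parameter vector, e.g.: *)
(* pSol[0.5, 0.1, 0.1, 0.2, 5., 5., 10.][100.] *)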
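
And for the second point, a minimal sketch of a hard constraint p1 > p2 imposed through a custom log-prior. I'm assuming here that the "LogPriorPDFFunction" you provide gets the parameter values in the order in which they're declared under "Parameters"; the flat value inside the allowed region is unnormalised, so don't put much trust in the resulting logevidence:

constrainedLogPrior = Function[params,
    If[ params[[1]] > params[[2]], (* i.e., p1 > p2 *)
        0.,             (* flat, unnormalised log-prior on the allowed region *)
        $MachineLogZero (* signals an illegal point to the sampler *)
    ]
];

You would then pass "LogPriorPDFFunction" -> constrainedLogPrior to defineInferenceProblem along with the other properties.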

Hope that helps, Sjoerd

spasetto commented 3 years ago

Thank you! Best, Stefano