keflavich closed this pull request 7 years ago.
I didn't know lmfit had emcee built-in! Pretty cool. I added sampling with emcee after the initial fit. There are significant covariances amongst the parameters and the LSQ fitting drastically underestimates uncertainties. But LSQ is finding the right solution.
In the plot above, blue is the LSQ fit and green is whatever lmfit does with the MCMC chain (mean/median).
Blue points/lines are the true parameter values.
```python
import corner
corner.corner(result_mcmc.flatchain, truths=[1, 0.5, -0.1, 25, 1, -2, 10, 0.1, 0.01])
```
Whoa. That is awesome. It seems like lmfit is very close to entirely replacing all of the machinery inside pyspeckit's modeling... it was already on track to doing that a few years ago, but that's more than I realized.
For reference:
LSQ Parameters:

```
amp:      0.989531973392 +/- 0.00945122113088
ampdx:    0.49758270655 +/- 0.0113077148051
ampdy:    -0.0889143102971 +/- 0.011206571752
center:   24.9081548367 +/- 0.139540130069
centerdx: 0.92367050196 +/- 0.163822157184
centerdy: -2.00230155592 +/- 0.125194913143
sigma:    9.95718518955 +/- 0.138405881782
sigmadx:  -0.118265511575 +/- 0.159553151103
sigmady:  -0.0768345934984 +/- 0.121272045507
```

MCMC Parameters:

```
amp:      0.959438792617 +/- 0.102378578339
ampdx:    0.475841863016 +/- 0.118451231058
ampdy:    -0.0867459598074 +/- 0.117968857225
center:   24.7914360854 +/- 1.41187539051
centerdx: 0.937878648285 +/- 1.7562454481
centerdy: -2.020092741 +/- 1.38449788694
sigma:    10.4336918183 +/- 1.61957667474
sigmadx:  -0.0506641539386 +/- 1.80038888925
sigmady:  0.0170304407855 +/- 1.29289209414
```
Some of the gradient distributions aren't symmetric, so the standard errors aren't really +/- stderr.
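One way to report those asymmetric uncertainties is to quote percentiles of the flatchain instead of a symmetric stderr. A quick numpy sketch, with a synthetic skewed sample standing in for a real chain:

```python
import numpy as np

# Illustrative posterior samples with an asymmetric (log-normal) shape,
# standing in for one column of result_mcmc.flatchain
rng = np.random.default_rng(42)
samples = np.exp(rng.normal(loc=0.0, scale=0.5, size=10000))

# Report median with asymmetric errors from the 16th/84th percentiles,
# rather than a symmetric mean +/- stddev
lo, med, hi = np.percentile(samples, [16, 50, 84])
print(f"{med:.3f} +{hi - med:.3f} / -{med - lo:.3f}")
```

For this skewed sample the upper error bar comes out noticeably larger than the lower one, which a single stderr would hide.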
I made the example a little more realistic: the usual use case is that we want to fit a faint profile based on an adjacent one. Unfortunately, this reveals the potential to get a strong width gradient, which we probably don't want to allow. Maybe leaving the width as a fixed parameter is a good idea; generally I would expect that when you use this sort of fitter, you don't have enough information to independently measure the width anyway.
Forcing the model positive helps.
@keflavich - Are you forcing `amp` to be positive in the model?
No, forcing the model itself to be positive. Individual predicted amplitudes (i.e., the amplitude at a corner) can be negative, and that's technically valid and even physically reasonable, as long as you interpret it as 'no signal' rather than 'negative signal'.
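As a toy illustration of that distinction, assuming a planar amplitude gradient across the footprint (the function and variable names here are hypothetical, not from the actual code):

```python
import numpy as np

def planar_amp(amp, ampdx, ampdy, dx, dy):
    """Amplitude varying linearly across a hypothetical 3x3 minicube footprint."""
    return amp + ampdx * dx + ampdy * dy

# Individual corner amplitudes may come out negative...
dy, dx = np.mgrid[-1:2, -1:2]
amps = planar_amp(0.1, 0.3, 0.3, dx, dy)
print(amps.min())  # the corner at (-1, -1) is negative

# ...but the emitted model is forced non-negative:
# "no signal" rather than "negative signal"
model_amps = np.maximum(amps, 0)
print(model_amps.min())  # 0.0
```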
Here's another possible approach: adding a parameter to sample which pixels are noise only.
```python
on = pm.Bernoulli('on', p=0.9, shape=data.shape[1:])
```
It's pretty good at picking out the pixels that don't add anything to the fit. This is the number of iterations (out of 500 total) in which a pixel is considered important to the fit.
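A numpy-only sketch of how that per-pixel count can be derived from the sampled mask; the trace array here is synthetic, but in the real case it would come from the pymc3 trace of `on`:

```python
import numpy as np

# Pretend trace of the Bernoulli "on" mask: (n_iterations, ny, nx) draws of 0/1
rng = np.random.default_rng(0)
n_iter, shape = 500, (5, 5)
p_on = rng.uniform(0.05, 0.95, size=shape)        # synthetic per-pixel probabilities
on_trace = rng.random((n_iter,) + shape) < p_on   # boolean mask per iteration

# Number of iterations in which each pixel was considered important to the fit
n_on = on_trace.sum(axis=0)
print(n_on.shape)  # (5, 5), each value between 0 and 500
```

Plotting `n_on` as an image gives the kind of "importance" map described above.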
That's cool. That's also black magic. This is an MCMC-only tool, right? Or can this be incorporated into the likelihood function when using gradient descent...?
Umm, yeah, I think it has to be MCMC-only. For a non-sampling approach, having a "turn-off" parameter would do something similar, but that might make for a really nasty likelihood space. Or you treat it as some sort of internal "model selection"; I don't have a good idea of how that would work.
I changed `HalfNormal` to a bounded Normal. `HalfNormal` is going to cause all of the widths to most likely be 0.
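The mode-pinning problem can be seen directly with scipy's distributions, used here as stand-ins for the pymc3 priors: a half-normal peaks at 0 no matter the scale, while a normal truncated below at 0 keeps its mode at the initial guess.

```python
import numpy as np
from scipy import stats

guess = 10.0  # initial width guess
sd = 3.0      # prior width around the guess

# HalfNormal: the mode is pinned at 0 regardless of scale
half = stats.halfnorm(scale=guess)

# Normal truncated below at 0, centered on the guess: the mode stays at the guess
a = (0 - guess) / sd  # truncation point in standard-deviation units
trunc = stats.truncnorm(a, np.inf, loc=guess, scale=sd)

xs = np.linspace(0.01, 30, 1000)
mode_half = xs[np.argmax(half.pdf(xs))]
mode_trunc = xs[np.argmax(trunc.pdf(xs))]
print(mode_half)   # near 0
print(mode_trunc)  # near 10
```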
So the bounded normal works well, but I can't figure out how to fix the lower bound rather than fit it. You can't even set the lower bound to be a variable itself; it has to be a value. The lack of documentation is killing me.
Is `PositiveNormal` a special case of `TruncatedNormal`? https://stackoverflow.com/a/32654901/814354

EDIT: Maybe that's what you're doing already and I just looked at outdated code...
So back to `HalfNormal` and put a hyperprior on the width? But `HalfNormal` has the mean fixed to 0. We should still be able to set the most likely value from an initial guess.
Yeah, they just took `TruncatedNormal` out of pymc3 for some reason. I guess this approach is more general.
I can't access the lower bound in `trace` when

```python
basic_model.sigma_lowerbound__ = pm.model.ObservedRV(tt.Variable(0))
```

is set. Maybe this means it isn't being sampled?
That's the goal, right? Not sampled? The model should not have it in the `model.free_RVs` property.
Given my own reading, this seems consistent with what's needed, namely a dive into theano.
It is still in `model.free_RVs`, but you can't access any of the samples via `trace.get_values("sigma_lowerbound__")`.
Nevermind. It's definitely still free.
A relatively blind attempt on Vlas's data:
It doesn't look great. But the next step is to compare this to other techniques over a full cube...
...or to implement a multicomponent version. Right now, `minicube` is useful for extending fits into low-S/N regions, but it is not useful for multicomponent analysis.
A nice example of why "unconstrained" is stupid:
With trivial constraints (central amplitude, width cannot be negative), this case works out pretty well.
Merging. Further development will go into a new PR.
A note on the bounded normal: `sigma_lowerbound__` isn't fitting the lower bound. It's the sample of sigma transformed from the bounded space to an unbounded one. pymc3 does the same thing for anything with a uniform prior (https://stackoverflow.com/questions/34577875/what-are-the-interval-transforms-in-pymc3-for-uniform-distributions).
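Assuming pymc3's standard lower-bound transform (log of the offset from the bound), the round trip between the bounded and unbounded spaces looks like this:

```python
import numpy as np

lower = 0.0  # fixed lower bound on sigma

def forward(sigma, lower=lower):
    # Lower-bounded value -> unconstrained real line
    # (what shows up in the trace as "sigma_lowerbound__")
    return np.log(sigma - lower)

def backward(z, lower=lower):
    # Unconstrained sample -> back into the bounded space
    return np.exp(z) + lower

sigma = 9.96
z = forward(sigma)
print(np.isclose(backward(z), sigma))  # round trip recovers sigma
```

So the `sigma_lowerbound__` column is just `forward(sigma)` per sample, not a separate fitted quantity.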
WIP