keflavich / MultiComponentFitting

Sprint repository for multi-component fitting. To be migrated to radio-astro-tools when ready.

minicube modeling #3

Closed: keflavich closed this issue 7 years ago

keflavich commented 7 years ago

WIP

e-koch commented 7 years ago

[figure: example_sigma_10]

e-koch commented 7 years ago

I didn't know lmfit had emcee built-in! Pretty cool. I added sampling with emcee after the initial fit. There are significant covariances amongst the parameters and the LSQ fitting drastically underestimates uncertainties. But LSQ is finding the right solution.

In the plot above, blue is the LSQ fit and green is whatever lmfit does with the MCMC chain (mean/median).

In the corner plot below, blue points/lines mark the true parameter values.
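
For anyone reproducing this, the workflow is roughly the following (a toy 1D-Gaussian sketch with illustrative names; the real minicube model has the nine parameters listed in the truths below):

import numpy as np
import lmfit

def residual(params, x, data, err):
    # single-Gaussian model; the actual model adds spatial gradient terms
    v = params.valuesdict()
    model = v['amp'] * np.exp(-(x - v['center'])**2 / (2 * v['sigma']**2))
    return (data - model) / err

params = lmfit.Parameters()
params.add('amp', value=1.0)
params.add('center', value=25.0)
params.add('sigma', value=10.0, min=0)

x = np.arange(50.)
data = np.exp(-(x - 25.)**2 / 200.) + np.random.normal(0, 0.1, x.size)

mini = lmfit.Minimizer(residual, params, fcn_args=(x, data, 0.1))
result_lsq = mini.minimize(method='leastsq')        # the LSQ fit
result_mcmc = mini.emcee(params=result_lsq.params,  # then sample with emcee
                         steps=1000, burn=200)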

import corner
# corner plot of the emcee flat chain, overlaying the true parameter values
corner.corner(result_mcmc.flatchain,
              truths=[1, 0.5, -0.1, 25, 1, -2, 10, 0.1, 0.01])

[figure: example_sigma_10_cornerplot]

keflavich commented 7 years ago

Whoa. That is awesome. It seems like lmfit is very close to replacing all of the machinery inside pyspeckit's modeling... it was already on track to do that a few years ago, but it's further along than I realized.

e-koch commented 7 years ago

For reference:

Parameter    LSQ                   MCMC
amp           0.9895 ± 0.0095       0.9594 ± 0.1024
ampdx         0.4976 ± 0.0113       0.4758 ± 0.1185
ampdy        -0.0889 ± 0.0112      -0.0867 ± 0.1180
center       24.9082 ± 0.1395      24.7914 ± 1.4119
centerdx      0.9237 ± 0.1638       0.9379 ± 1.7562
centerdy     -2.0023 ± 0.1252      -2.0201 ± 1.3845
sigma         9.9572 ± 0.1384      10.4337 ± 1.6196
sigmadx      -0.1183 ± 0.1596      -0.0507 ± 1.8004
sigmady      -0.0768 ± 0.1213       0.0170 ± 1.2929

Some of the gradient distributions aren't symmetric, so the standard errors aren't really symmetric ± uncertainties.
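
If needed, asymmetric intervals can be read straight off the chain (a sketch, using sigmadx as an example; flatchain is a pandas DataFrame):

import numpy as np

# 16th/50th/84th percentiles give an asymmetric one-sigma interval
lo, med, hi = np.percentile(result_mcmc.flatchain['sigmadx'], [16, 50, 84])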

keflavich commented 7 years ago

I made the example a little more realistic: the usual use case is that we want to fit a faint profile based on an adjacent one. Unfortunately, this reveals the potential to get a strong width gradient, which we probably don't want to allow. Maybe leaving the width as a fixed parameter is a good idea; generally I would expect that when you use this sort of fitter, you don't have enough information to independently measure the width anyway.

[figure]
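
If we do fix the width, that's a one-liner on the lmfit side (a sketch; the parameter name 'sigma' is assumed):

# hold the width at its initial guess instead of fitting it
params['sigma'].vary = False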

keflavich commented 7 years ago

Forcing the model positive helps.

[figure: forced_positive]

e-koch commented 7 years ago

@keflavich - Are you forcing amp to be positive in the model?

keflavich commented 7 years ago

No, forcing the model itself to be positive. There can be predicted individual amplitudes (i.e., the amplitude of a corner) that are negative, and that's technically valid and even physically reasonable as long as you interpret it as 'no signal' rather than 'negative signal'.
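
One way to do that inside an lmfit-style residual (a minimal sketch; model_spectrum is a hypothetical stand-in for the actual minicube model):

import numpy as np

def model_spectrum(params, x):
    # hypothetical single-Gaussian model; can evaluate to negative values
    v = params.valuesdict()
    return v['amp'] * np.exp(-(x - v['center'])**2 / (2 * v['sigma']**2))

def residual(params, x, data, err):
    # clip the full model at zero: negative predictions count as 'no signal'
    model = np.maximum(model_spectrum(params, x), 0)
    return (data - model) / err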

e-koch commented 7 years ago

Here's another possible approach: adding a parameter to sample which pixels are noise-only: on = pm.Bernoulli('on', p=0.9, shape=data.shape[1:]).

It's pretty good at picking out the pixels that don't add anything to the fit. The figure below shows the number of iterations in which each pixel was considered important to the fit (out of 500 total).

[figure: example_sigma_10_bernoulli_spatialout]
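
A self-contained sketch of how that indicator can enter a pymc3 model (toy cube with a fixed line shape; all names and values here are illustrative, not the actual model):

import numpy as np
import pymc3 as pm

# toy (nchan, ny, nx) cube with signal only in the central pixel
nchan, ny, nx = 50, 3, 3
chan = np.arange(nchan)
line = np.exp(-(chan - 25.0)**2 / 50.0)
data = np.random.normal(0, 0.1, (nchan, ny, nx))
data[:, 1, 1] += line

with pm.Model():
    amp = pm.Normal('amp', mu=1.0, sd=1.0)
    # per-pixel indicator: 1 = contributes signal, 0 = noise-only
    on = pm.Bernoulli('on', p=0.9, shape=(ny, nx))
    # zero out the model spectrum in pixels the sampler marks "off"
    model_cube = amp * line[:, None, None] * on.dimshuffle('x', 0, 1)
    pm.Normal('obs', mu=model_cube, sd=0.1, observed=data)
    trace = pm.sample(500)

# fraction of iterations in which each pixel was kept in the fit
print(trace['on'].mean(axis=0))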

keflavich commented 7 years ago

That's cool. That's also black magic. This is an MCMC-only tool, right? Or can this be incorporated into the likelihood function when using gradient-descent...?

e-koch commented 7 years ago

Umm, yeah, I think it has to be MCMC-only. For a non-sampling approach, having a "turn-off" parameter would do something similar, but that might make for a really nasty likelihood space. Or you could treat it as some sort of internal "model selection"; I don't have a good idea of how that would work.

e-koch commented 7 years ago

I changed HalfNormal to a bounded Normal. HalfNormal puts its peak at zero, so it makes zero the most likely value for all of the widths.
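
For concreteness, a bounded Normal in pymc3 looks something like this (a sketch; the mu/sd values are illustrative):

import pymc3 as pm

with pm.Model():
    # bound the Normal below at zero; unlike HalfNormal, the peak can sit
    # at the initial guess (mu) rather than being pinned to zero
    PositiveNormal = pm.Bound(pm.Normal, lower=0.0)
    sigma = PositiveNormal('sigma', mu=10.0, sd=2.0)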

e-koch commented 7 years ago

So the bounded normal works well, but I can't figure out how to fix the lower bound so that it isn't fit.

You can't even set the lower bound to be a variable itself; it has to be a value. The lack of documentation is killing me.

keflavich commented 7 years ago

Is PositiveNormal a special case of TruncatedNormal? https://stackoverflow.com/a/32654901/814354

EDIT: Maybe that's what you're doing already and I just looked at outdated code...

low-sky commented 7 years ago

So back to HalfNormal and put a hyperprior on width?

e-koch commented 7 years ago

But HalfNormal has its peak fixed at 0. We should still be able to set the most likely value from an initial guess.

e-koch commented 7 years ago

Yeah, they just took TruncatedNormal out of pymc3 for some reason. I guess this approach is more general.

e-koch commented 7 years ago

Hmm. https://github.com/pymc-devs/pymc3/pull/1942

e-koch commented 7 years ago

I can't access the lower bound in the trace when basic_model.sigma_lowerbound__ = pm.model.ObservedRV(tt.Variable(0)) is set. Maybe this means it isn't being sampled?

low-sky commented 7 years ago

That's the goal, right? Not sampled? The model should not have it in the model.free_RVs property.

Given my own reading, this seems consistent with what's needed, namely a dive into Theano.

e-koch commented 7 years ago

It is still in model.free_RVs, but you can't access any of the samples with trace.get_values("sigma_lowerbound__").

e-koch commented 7 years ago

Never mind. It's definitely still free.

keflavich commented 7 years ago

A relatively blind attempt on Vlas's data:

[figure: example_fit_on_toy_data]

It doesn't look great. But the next step is to compare this to other techniques over a full cube...

...or to implement a multicomponent version. Right now, minicube is useful for extending fits into low-S/N regions, but not for multicomponent analysis.

keflavich commented 7 years ago

A nice example of why "unconstrained" is stupid:

[figure: negative_sigma]

keflavich commented 7 years ago

With trivial constraints (the central amplitude and width cannot be negative), this case works out pretty well.

[figure: constrained]
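
For the record, those constraints are just bounds on the lmfit parameters (a sketch; the parameter names are assumed):

# the central amplitude and the width cannot go negative
params['amp'].min = 0
params['sigma'].min = 0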

keflavich commented 7 years ago

Merging. Further development will go into a new PR.

e-koch commented 7 years ago

A note on the bounded normal: "sigma_lowerbound__" isn't fitting the lower bound. It's the samples of sigma transformed from the bounded space to an unbounded one. pymc3 does the same thing for anything with a uniform prior (https://stackoverflow.com/questions/34577875/what-are-the-interval-transforms-in-pymc3-for-uniform-distributions).
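
To recover sigma itself from such a trace (a sketch assuming a lower bound of zero, where the stored transform is log(sigma - lower)):

import numpy as np

# invert the lower-bound transform; add the bound back if it isn't zero
sigma_samples = np.exp(trace.get_values('sigma_lowerbound__'))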