Closed: keflavich closed this issue 6 years ago
Maybe we should move the examples folder outside of multicomponentfitting?
I like the examples to be importable. You won't always want to do it but it can be nice.
I run the examples from a completely different directory anyway.
Are there particular reasons to put it in the root directory?
No, that's fine. I just didn't want to move into another directory to run the examples :)
yeah, at this point, you should be `python setup.py develop`ing the whole package and maybe `%run`ing the full path to the example files
Yeah, I'm doing that now, though I still get `ModuleNotFoundError`s from `from minicube_fit import minicube_model, unconstrained_fitter`.
Is this a py3-only thing, importing a submodule and then importing names from within it?
actually I don't think you can do that at all... so it's probably just a mistake I made
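For anyone hitting the same `ModuleNotFoundError`: a bare `from minicube_fit import ...` only resolves if the directory containing `minicube_fit.py` is on `sys.path`. A minimal self-contained sketch (the layout here is made up for illustration):

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the real examples directory: write a tiny
# minicube_fit.py into a temp directory that is NOT yet on sys.path.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "minicube_fit.py"), "w") as f:
    f.write("def minicube_model():\n    return 'ok'\n")

try:
    from minicube_fit import minicube_model  # fails: directory not on sys.path
except ModuleNotFoundError:
    sys.path.insert(0, tmpdir)       # roughly what an install/develop step fixes
    importlib.invalidate_caches()    # make sure the new path entry is rescanned
    from minicube_fit import minicube_model
```

Running the example scripts from another directory fails for the same reason unless the package has been installed (e.g. with `python setup.py develop`).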
What was weird was that it didn't raise an error on that line a few times. It would raise an error on the import on the line below it. Anyways, moving on...
Well, on the two-component stuff... it seems that it's pretty easy to find a local minimum where the two planes perfectly cancel each other.
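A toy numpy illustration of that degeneracy (the numbers are made up): two plane components with equal and opposite amplitude gradients sum to a constant, so the combined model carries no information about either gradient.

```python
import numpy as np

yy, xx = np.mgrid[0:5, 0:5]

plane1 = 1.0 + 0.3 * xx  # component 1: amplitude increasing with x
plane2 = 1.0 - 0.3 * xx  # component 2: amplitude decreasing with x

total = plane1 + plane2  # the gradients cancel: total is flat (~2 everywhere)
assert np.allclose(total, 2.0)
```

Any fit that only sees `total` can trade amplitude between the two planes freely, which is exactly that kind of local minimum.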
pymc3 is performing well for a simple two-component spatial fit! The red (not the residual line) and magenta lines are the two components as fit with pymc. The blue is from lmfit and green is from emcee. This isn't a completely fair comparison for emcee, though, since it's using the bad starting values from lmfit; even so, it isn't able to make any improvements.
There are two main issues to address: the fit has to keep adjusting `centerdx` and `centerdy` until it gets a good fit, so we need the good initial guess framework.

I've added an attempt to use spatial connectivity in the pymc models. The Bernoulli parameters for each component, which control whether each spectrum is included in the fit, now have varying probabilities of success that depend on the neighbouring structure of each pixel. @low-sky, it turns out this is difficult to do in the actual MCMC because `on` depends on `p`, which in turn depends on `on`, and the sampling in pymc is too much of a black box to pull out the previous values from each iteration. Instead, you need to give it an initial guess of which components are on in which pixels, e.g. `guess['on0'] = model.sum(0) > 0`.

Then the `p` values are computed using the neighbouring structure around each pixel, weighted by an assumed kernel function (I used a 2D Gaussian). This also has to be passed as a guess: `guess['p0'] = spatial_covariance_structure(guess['on0'], stddev=kern_width)`.
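The repo's `spatial_covariance_structure` isn't shown here, but a minimal sketch of the idea, assuming it amounts to smoothing the on/off mask with a 2D Gaussian kernel, could look like:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def spatial_covariance_structure(on_mask, stddev=1.0):
    """Hypothetical sketch: convert a boolean 'component on' mask into
    Bernoulli success probabilities by smoothing with a 2D Gaussian,
    so pixels surrounded by 'on' neighbours get a higher p."""
    p = gaussian_filter(on_mask.astype(float), sigma=stddev)
    # keep the probabilities strictly inside (0, 1) for the Bernoulli prior
    return np.clip(p, 0.01, 0.99)

# Toy model cube (spectral axis first) with signal in a small patch
model = np.zeros((3, 5, 5))
model[:, 2:4, 2:4] = 1.0

guess = {}
guess['on0'] = model.sum(0) > 0
guess['p0'] = spatial_covariance_structure(guess['on0'], stddev=1.0)
```

Pixels inside and next to the 'on' patch end up with higher success probabilities than distant pixels, which is what lets the sampler prefer spatially coherent solutions.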
I did some simple tests with two components with opposing amplitude gradients and it works really well in the overlap regions. It really needs more test cases thrown at it, but it might help a lot with the whole component switching issue.
More testing with spatial connectivity (see `minicube_example.py`). All of the examples I could find using these methods were machine-learning applications, so maybe all of these methods are just inappropriate for our models?

The initial optimization uses `scipy.optimize.fmin_bfgs` (this is the `pm.find_MAP` call). We should be able to plug in any scipy minimization routine here, so it would be cool to see how differential evolution works! If we can get a robust and quick estimate, then the parameters in the chain can be tightly constrained and the NUTS sampler should run a lot quicker (its run time doesn't scale as strongly with the number of parameters).

I can't say I'm following 100% (there are methods you're playing with that I don't know anything about), but to your last point, I have the feeling that all the methods I tried are most sensitive to variations in line width, which I usually think of as the parameter least likely to vary significantly. However, there are interesting cases, such as the 'coherent core' case, that do have significant linewidth variations over relatively small physical scales.
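On the suggestion above of swapping optimizers into the MAP step: a toy 1D comparison (the objective is invented, and nothing here uses pymc's API) of `fmin_bfgs` against `differential_evolution` on a multimodal function:

```python
import numpy as np
from scipy.optimize import differential_evolution, fmin_bfgs

def neg_logp(x):
    """Invented multimodal stand-in for a negative log-posterior."""
    x = np.atleast_1d(x)[0]
    return np.sin(3 * x) + 0.1 * x ** 2

# From a poor starting value, BFGS may settle into a nearby local minimum.
x_bfgs = fmin_bfgs(neg_logp, x0=2.0, disp=False)[0]

# Differential evolution searches the whole bounded interval instead.
x_de = differential_evolution(neg_logp, bounds=[(-5, 5)], seed=0).x[0]
```

Here `x_de` lands in the global minimum (near x ≈ -0.5) regardless of any starting guess, which is the "robust and quick estimate" role suggested above; the trade-off is many more function evaluations than a single BFGS run.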
I think the sensitivity to width is a common theme for all of these methods, then. We can make the default fit settings strongly prefer no change in width (a really narrow prior distribution), but make it possible to loosen that when we have a good idea that the data will need it (I still have these hard-wired in). The same goes for the other gradients: the center gradients could use a local approximation to the rotation curve, etc.
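None of this is the package's actual API, but the "narrow prior on width change, loosened on demand" idea can be sketched with a plain Gaussian log-prior on a hypothetical width-gradient parameter:

```python
import numpy as np

def width_gradient_logprior(widthdx, scale=1e-3):
    """Gaussian log-prior centred on zero width gradient. A tiny `scale`
    strongly prefers constant line width; a larger one permits real
    width variations (e.g. across a coherent dense core)."""
    return -0.5 * (widthdx / scale) ** 2 - np.log(scale * np.sqrt(2 * np.pi))

tight = width_gradient_logprior(0.01)             # default: heavily penalised
loose = width_gradient_logprior(0.01, scale=0.1)  # loosened for special cases
```

Exposing something like `scale` as a user-facing setting, instead of hard-wiring it, is the configurability being discussed.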
This is more physics than code, but I think that the constant-width prior is a good assumption. I wanted to comment specifically on the dense core case: I'm particularly interested in picking out the wide line width component in the presence of the coherent dense core. Thus, having a wide component persist across the core is a really compelling feature here.
our discussion here was great, but I think we want the code merged in...
This is where we'll need some of those high-dimensional parameter space explorers.....