Closed: keflavich closed this issue 6 years ago
Maybe we should move the examples folder outside of multicomponentfitting?
I like the examples to be importable. You won't always want to do it but it can be nice.
I run the examples from a completely different directory anyway.
Are there particular reasons to put it in the root directory?
No, that's fine. I just didn't want to move into another directory to run the examples :)
yeah, at this point, you should be `python setup.py develop`ing the whole package and maybe `%run`ing the full path to the example files
Yeah, I'm doing that now, though I still get `ModuleNotFoundError`s from `from minicube_fit import minicube_model, unconstrained_fitter`.
Is this a py3-only thing, importing a submodule and then importing names from within it?
actually I don't think you can do that at all... so it's probably just a mistake I made
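For anyone hitting the same `ModuleNotFoundError`: a bare `from minicube_fit import ...` only resolves if the directory containing `minicube_fit.py` is on `sys.path`. A minimal self-contained sketch (the layout here is made up for illustration):

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the real examples directory: write a tiny
# minicube_fit.py into a temp directory that is NOT yet on sys.path.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "minicube_fit.py"), "w") as f:
    f.write("def minicube_model():\n    return 'ok'\n")

try:
    from minicube_fit import minicube_model  # fails: directory not on sys.path
except ModuleNotFoundError:
    sys.path.insert(0, tmpdir)       # roughly what an install/develop step fixes
    importlib.invalidate_caches()    # make sure the new path entry is rescanned
    from minicube_fit import minicube_model
```

Running the example scripts from another directory fails for the same reason unless the package has been installed (e.g. with `python setup.py develop`).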
What was weird was that it didn't raise an error on that line a few times. It would raise an error on the import on the line below it. Anyways, moving on...
Well, on the two-component stuff... it seems that it's pretty easy to find a local minimum where the two planes perfectly cancel each other.
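A toy numpy illustration of that degeneracy (the numbers are made up): two plane components with equal and opposite amplitude gradients sum to a constant, so the combined model carries no information about either gradient.

```python
import numpy as np

yy, xx = np.mgrid[0:5, 0:5]

plane1 = 1.0 + 0.3 * xx  # component 1: amplitude increasing with x
plane2 = 1.0 - 0.3 * xx  # component 2: amplitude decreasing with x

total = plane1 + plane2  # the gradients cancel: total is flat (~2 everywhere)
assert np.allclose(total, 2.0)
```

Any fit that only sees `total` can trade amplitude between the two planes freely, which is exactly that kind of local minimum.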
pymc3 is performing well for a simple two-component spatial fit! The red (not the residual line) and magenta lines are the two components as fit with pymc. The blue is from lmfit and green is from emcee. This isn't a completely fair comparison for emcee, though, since it's using the bad starting values from lmfit; even so, it isn't able to make any improvements.
There are two main issues to address: the fit has to keep adjusting `centerdx` and `centerdy` until it gets a good fit, so we need the good initial guess framework.

I've added an attempt to use spatial connectivity in the pymc models. The Bernoulli parameters for each component, which control whether each spectrum is included in the fit, now have varying probabilities of success that depend on the neighbouring structure of each pixel. @low-sky, it turns out this is difficult to do in the actual MCMC because `on` depends on `p`, which in turn depends on `on`, and the sampling in pymc is too much of a black box to pull out the previous values from each iteration. Instead, you need to give it an initial guess of which components are on in which pixels, e.g. `guess['on0'] = model.sum(0) > 0`.

Then the `p` values are computed using the neighbouring structure around each pixel, weighted by an assumed kernel function (I used a 2D Gaussian). This also has to be passed as a guess: `guess['p0'] = spatial_covariance_structure(guess['on0'], stddev=kern_width)`.
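The repo's `spatial_covariance_structure` isn't shown here, but a minimal sketch of the idea, assuming it amounts to smoothing the on/off mask with a 2D Gaussian kernel, could look like:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def spatial_covariance_structure(on_mask, stddev=1.0):
    """Hypothetical sketch: convert a boolean 'component on' mask into
    Bernoulli success probabilities by smoothing with a 2D Gaussian,
    so pixels surrounded by 'on' neighbours get a higher p."""
    p = gaussian_filter(on_mask.astype(float), sigma=stddev)
    # keep the probabilities strictly inside (0, 1) for the Bernoulli prior
    return np.clip(p, 0.01, 0.99)

# Toy model cube (spectral axis first) with signal in a small patch
model = np.zeros((3, 5, 5))
model[:, 2:4, 2:4] = 1.0

guess = {}
guess['on0'] = model.sum(0) > 0
guess['p0'] = spatial_covariance_structure(guess['on0'], stddev=1.0)
```

Pixels inside and next to the 'on' patch end up with higher success probabilities than distant pixels, which is what lets the sampler prefer spatially coherent solutions.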
I did some simple tests with two components with opposing amplitude gradients and it works really well in the overlap regions. It really needs more test cases thrown at it, but it might help a lot with the whole component switching issue.
More testing with spatial connectivity (see `minicube_example.py`). All of the examples I could find using these methods were machine-learning applications, so maybe all of these methods are just inappropriate for our models?

The initial optimization uses `scipy.optimize.fmin_bfgs` (this is the `pm.find_MAP` call). We should be able to plug in any scipy minimization routine here, so it would be cool to see how differential evolution works! If we can get a robust and quick estimate, then the parameters in the chain can be tightly constrained and the NUTS sampler should run a lot quicker (its run time doesn't scale as strongly with the number of parameters).

I can't say I'm following 100% (there are methods you're playing with that I don't know anything about), but to your last point, I have the feeling that all the methods I tried are most sensitive to variations in line width, which I usually think of as the parameter least likely to vary significantly. However, there are interesting cases, such as the 'coherent core' case, that do have significant linewidth variations over relatively small physical scales.
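On the suggestion above of swapping optimizers into the MAP step: a toy 1D comparison (the objective is invented, and nothing here uses pymc's API) of `fmin_bfgs` against `differential_evolution` on a multimodal function:

```python
import numpy as np
from scipy.optimize import differential_evolution, fmin_bfgs

def neg_logp(x):
    """Invented multimodal stand-in for a negative log-posterior."""
    x = np.atleast_1d(x)[0]
    return np.sin(3 * x) + 0.1 * x ** 2

# From a poor starting value, BFGS may settle into a nearby local minimum.
x_bfgs = fmin_bfgs(neg_logp, x0=2.0, disp=False)[0]

# Differential evolution searches the whole bounded interval instead.
x_de = differential_evolution(neg_logp, bounds=[(-5, 5)], seed=0).x[0]
```

Here `x_de` lands in the global minimum (near x ≈ -0.5) regardless of any starting guess, which is the "robust and quick estimate" role suggested above; the trade-off is many more function evaluations than a single BFGS run.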
I think the sensitivity to width is a common theme for all of these methods, then. We can make the default fit settings strongly prefer no change in width (a really narrow prior distribution), but make it possible to loosen that when we have a good idea that the data will need it (I still have these hard-wired in). The same goes for the other gradients: the center gradients could use a local approximation to the rotation curve, etc.
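None of this is the package's actual API, but the "narrow prior on width change, loosened on demand" idea can be sketched with a plain Gaussian log-prior on a hypothetical width-gradient parameter:

```python
import numpy as np

def width_gradient_logprior(widthdx, scale=1e-3):
    """Gaussian log-prior centred on zero width gradient. A tiny `scale`
    strongly prefers constant line width; a larger one permits real
    width variations (e.g. across a coherent dense core)."""
    return -0.5 * (widthdx / scale) ** 2 - np.log(scale * np.sqrt(2 * np.pi))

tight = width_gradient_logprior(0.01)             # default: heavily penalised
loose = width_gradient_logprior(0.01, scale=0.1)  # loosened for special cases
```

Exposing something like `scale` as a user-facing setting, instead of hard-wiring it, is the configurability being discussed.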
This is more physics than code, but I think that the constant-width prior is a good assumption. I wanted to comment specifically on the dense core case: I'm particularly interested in picking out the wide line width component in the presence of the coherent dense core. Thus, having a wide component persist across the core is a really compelling feature here.
our discussion here was great, but I think we want the code merged in...
This is where we'll need some of those high-dimensional parameter space explorers.....