Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
68 stars 22 forks source link

chebyshev correction for SpectrumModel #137

Closed mileslucas closed 3 years ago

mileslucas commented 3 years ago

This PR adds functionality within SpectrumModel for doing Chebyshev correction. I've decided that the coefficients will start from the 2nd term, since the 1st (~linear~ constant) chebyshev coefficient is fully degenerate with log_scale. This means if you specify cheb=[0.1, 0.1] the full chebyshev coefficients used are [1, 0.1, 0.1]

General usage will look like:

# add via constructor
model = SpectrumModel(..., cheb=[0.1, 0.1])
# add using flatterdict keys
model["cheb"] = [-0.2, 0.1]
model["cheb:0"] = 0
# equivalently using properties, more for internal usage
model.cheb = ...

Currently this is working in terms of instantiation and evaluating/correcting the fluxes, but there is a bug related to the inner parameter data structure that doesn't allow the model["cheb:0"] = X syntax (https://github.com/gmr/flatdict/issues/44). This means the whole set_param_dict interface is broken, so we can't train the model.

If that packages' author is not responsive, I can hack in a workaround that manually checks for "cheb" keys, but I'd really rather not muddy up the code here if it can be resolved upstream. Once that bug is resolved, what's left should just be some unit-testing and making sure the docs are up to date.

mileslucas commented 3 years ago

Here are some examples of how it looks currently-

from Starfish.models import SpectrumModel

model = SpectrumModel(
    "F_SPEX_emu.hdf5",
    data,
    grid_params=[6800, 4.2, 0],
    Av=0,
    cheb=[-0.5, -0.1],
    global_cov=dict(log_amp=38, log_ls=2),
)
model
SpectrumModel
-------------
Data: Example Spectrum
Emulator: F_SPEX_emu
Log Likelihood: None

Parameters
  Av: 0
  cheb: [-0.5, -0.1]
  global_cov:
    log_amp: 38
    log_ls: 2
  T: 6800
  logg: 4.2
  Z: 0
model.plot()

Screen Shot 2021-03-10 at 5 05 27 PM

iancze commented 3 years ago

This is looking great, thanks Miles! I agree with you that the Chebyshev "parameters" should start from the second item in the list.

Just to clarify terminology though, I think its the $T_0$ term that is a constant that is a flat line and thus degenerate with log_scale. The $T_1$ is the linear term and is needed in the expansion.

If the flatdict author is reasonably responsive and the fix is straightforward, I think it makes sense to wait for an upstream resolution. But it sounds like in the meantime there is still a way to work with the core Chebyshev object without the parameter data structure?

Just let me know if/when you think this should be merged, it looks good to me.

mileslucas commented 3 years ago

Thanks for the terminology clarification- I'll make sure the docs are consistent!

I think I'll end up manually coding in the interface (I don't want the flat key for the first coefficient, which is $c_1$, to be "cheb:0"). I can get that done within the week along with the tests and such, so I'll turn this into a draft pull till I've got that done.

mileslucas commented 3 years ago

Sometimes in python, to get a list to work, you need to make it a dictionary, and then turn it into a list. (Is this a riddle, a gripe, or an exclamation? maybe all 3)

Chebyshev coefficients can be added in the following ways

# all equivalent
SpectrumModel(..., cheb=[0.1, -0.2], ...)
model["cheb"] = [0.1, -0.2]
model["cheb:1"] = 0.1
model["cheb:2"] = -0.2

in addition, coeffs can be inserted at arbitrary index

model["cheb:5"] = -0.1
model.cheb == [0.1, -0.2, 0, 0, -0.1]
# this will error!
model["cheb:0"] = ...

The coefficients are stored in a dictionary internally, so some translation happens when getitem/setitem gets called, but this allows us to index starting at 1:

model.labels
('Av',
 'cheb:1',
 'cheb:2',
 'T',
 'Z')

the coeffs can be frozen and thawed similarly to global_cov meaning that freezing "cheb" will fix all of the coefficients, but you could also freeze a single coeff with "cheb:{k}" where k is the coeff index.

Locally I was able to recreate the MAP optimization in the "Single Spectrum Example" notebook using 3rd order chebyshev correction, so this should be ready to go.

iancze commented 3 years ago

This interface is really nice! Thanks again for all of the hard work. Looks ready to merge to me if you're happy with it.