Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
69 stars 22 forks source link

Priors on stellar parameters #32

Open gully opened 8 years ago

gully commented 8 years ago

Priors on stellar parameters

The problem: I suspect that a common use case might resemble the following scenario: A user would like to force the value of a parameter to either a single value, or else a distribution- for example a mean and variance. Perhaps she has astro-seismic log g, or metallicity measured from a companion. Or she might wish to constrain more than one stellar property or calibration parameter. The existing framework currently assumes uniform priors for stellar parameters over the library interval. There exists a "fix_logg" mechanism to fix the log g to a singular value, but it's labeled as a "durty hack".

Suggested solution: Add a mechanism for picking among an array of priors, or a user configurable prior. The default prior would remain as stated in Czekala et al. 2015, but through some flag, a user could specify a normal distribution, or whatever.

iancze commented 8 years ago

I'm happy that we're at a stage where we can start removing some of my "durty hacks" :neckbeard: :blush:

This may be a bit more dangerous than what you suggested, but what about having users actually write their own prior in python code, in a prior.py file stored in the local directory? This file would be imported into the sampling script, and would provide a function that takes in a ThetaParam or PhiParam object and returns the prior (or -np.inf). This would provide a lot more flexibility over possible distributions than we could hope to achieve with Gaussians. I'm still at a loss on how we can allow users to easily fix parameters to values (e.g. log g = 4.29), but I think this prior.py file would cover a lot of use cases.

The cons are that someone might write code that doesn't properly interface with the package, but I think with some proper example files that this might be a mostly avoidable.

gully commented 8 years ago

I really like the idea of having a user-defined prior.py file! I think this is the way to go.

A thought on fixing parameters-- the only thing I can think of is to make a very narrow Gaussian and then set the jump size equal to (or much less than) the very narrow Gaussian sigma. This too is a hack, and has the drawback that it would eventually throw away samples, but in principle even very well constrained parameters must have some uncertainty anyways.

gully commented 8 years ago

Here is a sketch of a solution I'm working on in a fork of this repo. It doesn't leverage the ThetaParam object because of differences in the implementation, but the main idea is there. If the user wishes to apply a prior, she/he would need to:

  1. Write a file with any name that defines a function user_defined_prior(), which takes a vector of the stellar parameters as an input.
  2. In the config.yaml file, add an entry: Theta_priors: which is the path to the file.
def lnprior(p):
    if not ( (p[11] > 0) and (p[6] < p[0]) ):
        return -np.inf

# Try to load a user-defined prior
try:
    sourcepath_env = Starfish.config['Theta_priors']
    sourcepath = os.path.expandvars(sourcepath)
    with open(sourcepath, 'r') as f:
        sourcecode = f.read()
    code = compile(sourcecode, sourcepath, 'exec')
    exec(code)
    lnprior = user_defined_prior
    print("Using the user defined prior in {}".format(sourcepath_env))
except:
    pass
iancze commented 8 years ago

:+1: this looks excellent

I'm still wondering about how to fix the parameters. @dfm points out that unimportant nuisance parameters hurts your sampling efficiency, so while I suppose a narrow prior would work, I'm wondering if there is a way we can specify this in the config file like

parnames : [temp, logg, Z, etc...]
fix_params : [1, 2]
fix_values: [4.29, 0.0]

and have a conversion method between the MCMC proposal interface and the update_theta which would insert logg = 4.29 and Z = 0.0

iancze commented 7 years ago

Here is a possible implementation of methods to sample with fixed parameters: https://github.com/iancze/PSOAP/blob/master/psoap/utils.py

mileslucas commented 5 years ago

In the spirit of our recent discussion, I actually really like the idea of not baking in priors at all for the primary fitting (not the emulator fitting). If we can define a model that has some log_like method, we let users decide if they want to hyper-parametrize the values. Something like this:

import scipy.stats as st

def ln_like(Theta, Phi, **etc):
    T, logg, Z = Theta
    rv_T = st.uniform(3000, 4000)
    rv_logg = st.uniform(4.0, 5.0)
    rv_Z = st.uniform(-0.5, 0.5)
    primary_model = Starfish.ExtremeRobustModelNoBugsEver(Theta, Phi, **etc)
    return primary_model.log_prob(data.flux) + rv_T.logpdf(T) + rv_logg.logpdf(logg) + rv_Z.logpdf(Z)

sampler = Starfish.sampler.TheBestSamplerLiterallyEver(tune=500, kernel_lag=500, samples=10000)
p0 = ...

trace = sampler.sample(ln_like, p0, progress=True)