DarkEnergySurvey / ugali

Ultra-faint galaxy likelihood toolkit
MIT License
15 stars 15 forks source link

Loading config in mcmc #55

Open sidneymau opened 6 years ago

sidneymau commented 6 years ago

Working on MCMC analysis with ugali and tyring to use g-i bands has shown that ugali.analysis.source seems to create a g-r isochrone even when the relevant config files specify band_1 = g and band_2 = i. This behavior has been evidenced in ugali/pipeline/run_05.0_followup.py and ugali/analysis/mcmc.py, but may be elsewhere.

sidneymau commented 6 years ago

After some digging, ugali/analysis/mcmc.py references ugali/analysis/loglike.py, where the source is made here. It looks like config.get(section) != None, but config.get(section).get('source') == None, so it ends up reading params == None. If this function is forced to use params = config.get('source') , then the params end up loading properly and we get out a g-i isochrone. It's not clear to me whether this is the expected behavior, or if we want to throw an error when getting the section is not none but getting the source from the section is none. Disregarding that, a simple fix is writing the function like:

def createSource(config, section=None, **kwargs):
    config = Config(config)
    source = Source()

    if (config.get(section) and config.get(section).get('source')) is not None:
        params = config.get(section).get('source')
    else:
        params = config.get('source')

    if params is not None:
        source.load(params)

    source.set_params(**kwargs)
    return source
sidneymau commented 6 years ago

This lead to some other pieces of ugali breaking... suggesting that support for bands other than g-r isn't universal, or I'm missing some masks:

Traceback (most recent call last):
  File "./ugali/analysis/mcmc.py", line 352, in <module>
    like = ugali.analysis.loglike.createLoglike(config,source)
  File "/home/s1/smau/software/ugali/ugali/analysis/loglike.py", line 600, in createLoglike
    observation = createObservation(config,lon,lat)
  File "/home/s1/smau/software/ugali/ugali/analysis/loglike.py", line 549, in createObservation
    mask = createMask(config,roi)
  File "/home/s1/smau/software/ugali/ugali/analysis/loglike.py", line 564, in createMask
    mask = ugali.observation.mask.Mask(config, roi)
  File "/home/s1/smau/software/ugali/ugali/observation/mask.py", line 52, in __init__
    self._photometricErrors()
  File "/home/s1/smau/software/ugali/ugali/observation/mask.py", line 349, in _photometricErrors
    pars_2 = MAGERR_PARAMS[release][band_2]
KeyError: 'i'
kadrlica commented 6 years ago

Didn't we talk about that last issue before? The updates to MAGERR_PARAMS would need to be in ugali/utils/constants.py.

kadrlica commented 6 years ago

I also think that we want the fix to be a bit deeper than createSource. When creating an isochrone it should always respect the bands provided in the config regardless of whether the source section exists or not. I think this means we need to dig into what the factory is doing when it creates the isochrone.

sidneymau commented 6 years ago

oh, I think my hacked changes to MAGERR_PARAMS were lost when I updated my fork, which is why I was confused to get that error again. I agree that createSource is not the root of the issue, but that's also the earliest place I see a config file getting passed to the Source object.

sidneymau commented 6 years ago

Will we want to look for a config file whenever we create an isochrone (i.e. in the factory)? or do we just want to load more parameters whenever we do read in a config file? I'm not sure how we would make this check whenever initializing an isochrone, which is what I think would need to happen at the factory level if we wanted to do that.

kadrlica commented 6 years ago

That's a good point. We would like to retain the ability to create a source without a config, which means that createSource is probably the place we need to do it. However, we want to have the default action be to read the proper mag_1 and mag_2 variables without any source defined.

sidneymau commented 6 years ago

Since we'll have to read this from the config file, how would this be best implemented? I've been playing around with it a bit, and we could have createSource look in the isochrone section of the config file and then on source.isochrone. However, since the source section just appears to be a wrapper for isochrone and kernel, I'm not sure if this is much better than being ambivalent to the existence of a source section...

sidneymau commented 6 years ago

As far as running the mcmc, this has been resolved by adding band_1 and band_2 specifications to srcmdl.yaml in the isochrone section. Right now, this line will load the isochrone specified by the srcmdl and overwrite the isochrone loaded from the config file. If we want to use non-default isochrone parameters, then these should be specified in both config.yaml and srcmdl.yaml for now.

kadrlica commented 6 years ago

Thanks Sid, this sounds like the underlying issue.

The mcmc.py script allows (too much) flexibility in defining the source model and fit parameters. Looking in mcmc.py I think the order of operations is:

  1. Create the source from the source variable in the mcmc section of the config
  2. If a srcmdl file is provided, then overwrite with the values from the section of the srcmdl file corresponding to the source name.
  3. If coordinates are provided on the command line, then update the source centroid with the coordinates.
  4. If the params variable exists in the mcmc section of the config, then free the listed params in the source.
  5. If opts.grid is set, then initialize the parameter values to the best fit from an initial grid scan.

Each of these pieces of functionality have been useful in the past, since they allow a lot of flexibility in the input parameters. For example:

  1. You can run the mcmc fit without a source model by just providing a set of coordinates.
  2. You can free specific parameters from the config without ever having to touch the srcmdl.
  3. You can fully specify all the details of the srcmdl if desired.

Clearly, this complexity and lack of documentation has been detrimental for users. I think we can deal with this in a few ways.

  1. We can reduce the functionality with the easiest option being that we require a srcmdl in order to fit a candidate.
  2. We can better define and document the order of priority.