Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Roadmap for Chunked Inference Approach #74

Open kevinkhu opened 7 years ago

kevinkhu commented 7 years ago

This will serve as a roadmap for the implementation of the "chunking" approach, which will eventually become a major mode of operation for Starfish. The rationale is that fitting spectral model grids (for main sequence stars at high resolution, cool stars at high and low resolution, and exoplanet spectra in general) is currently a systematics-dominated problem: there are wavelength regions of the data spectrum for which the model grids cannot produce an accurate model. It follows that if our main goal is inference of accurate stellar parameters, then a main focus will need to be a "calibration" of these systematic effects. Note that in some sense this roadmap supersedes that of #58, although the ideas in that roadmap are mostly complementary to those presented here.

Instead of fitting the full spectrum and downweighting discrepant regions (as in "classic" Starfish), we propose to fit individual chunks of the spectrum at a time. Chunking allows us to compare smaller regions of spectra to models and identify more easily where models are inaccurate.

The chunking approach will function by segmenting the spectrum into independent regions, where spectral inference to determine the fundamental stellar properties (Teff, log g, [Fe/H], etc.) is done on each chunk independently. In an obvious sense, this violates much of what we know about stellar astrophysics, i.e., the emergent spectrum is the realization of complex stellar astrophysics and each spectral line is by no means physically independent from the others. However, since we are dealing with strong model systematics (e.g., some spectral lines simply do not fit the data for any combination of Teff, log g, [Fe/H]), this approach allows us to get a better lay of the land, and provides a groundwork for exploring which regions of the spectrum we can trust and which ones we should be skeptical of.

There are a few tasks that need to be addressed in order to implement this approach.

Setup and initialization

First, the user should be able to take a model grid, a data spectrum, and a list of chunk wavelength boundaries, and then run some scripts to segment the data into individual chunks. The idea is that the inference on each chunk can be done completely independently of any other chunk, and so the scripts should be organized to run with that in mind. Once the posterior for each chunk is delivered, however, we will want tools that can pull the posteriors from each directory and plot them.
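As a rough illustration of the segmentation step, here is a minimal sketch. The function name `segment_spectrum`, the dictionary keys, and the half-open `[wl_start, wl_end)` convention are assumptions made for this example, not part of Starfish. It uses only the standard library; with NumPy arrays the same slicing could be done with `np.searchsorted`.

```python
from bisect import bisect_left


def segment_spectrum(wl, fl, boundaries):
    """Split a spectrum into chunks (hypothetical helper, not a Starfish API).

    wl         : wavelengths, sorted in increasing order
    fl         : fluxes, same length as wl
    boundaries : list of (wl_start, wl_end) pairs; each chunk covers the
                 half-open interval [wl_start, wl_end)
    """
    if len(wl) != len(fl):
        raise ValueError("wl and fl must have the same length")

    chunks = []
    for chunk_id, (wl_start, wl_end) in enumerate(boundaries):
        # Binary search for the first pixel at or beyond each boundary.
        lo = bisect_left(wl, wl_start)
        hi = bisect_left(wl, wl_end)
        chunks.append(
            {
                "chunk_id": chunk_id,
                "wl_start": wl_start,
                "wl_end": wl_end,
                "wl": wl[lo:hi],
                "fl": fl[lo:hi],
            }
        )
    return chunks


# Toy spectrum: 10 pixels from 5000.0 to 5004.5 in steps of 0.5.
wl = [5000.0 + 0.5 * i for i in range(10)]
fl = [1.0] * len(wl)
chunks = segment_spectrum(wl, fl, [(5000.0, 5002.0), (5002.0, 5005.0)])
```

Each returned chunk carries everything needed to fit it in isolation, which matches the goal of running the inference on every chunk independently.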

There are a few considerations to take care of here. The individual chunk directories will be labeled by chunkID_wlstart_wlend, appropriately zero-padded so that there are no conflicts when using typically sized chunks from optical to infrared wavelengths.
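One possible way to build such labels is sketched below. The helper name `chunk_dir_name`, the padding widths, and the choice of integer-rounded Angstroms are all assumptions for illustration; the roadmap only fixes the chunkID_wlstart_wlend pattern.

```python
def chunk_dir_name(chunk_id, wl_start, wl_end, id_width=3, wl_width=6):
    """Build a zero-padded directory label (hypothetical helper).

    Assumes wavelengths are in Angstroms and rounds them to integers.
    Six digits leaves room for optical through infrared wavelengths.
    """
    start = round(wl_start)
    end = round(wl_end)
    return f"{chunk_id:0{id_width}d}_{start:0{wl_width}d}_{end:0{wl_width}d}"


print(chunk_dir_name(7, 5100, 5200))         # 007_005100_005200
print(chunk_dir_name(12, 21000.4, 21500.6))  # 012_021000_021501
```

With fixed widths, a plain alphabetical directory listing sorts the chunks in the same order as their IDs.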

Also, we need an easy way to regenerate the sub-directories if the chunk wavelength boundaries change. There should also be a way to select an individual sub-directory and regenerate just that one. For these reasons, we are thinking that an individual Makefile within each sub-directory might be the best option.
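Whatever mechanism ends up driving the regeneration, the bookkeeping when the boundaries change amounts to comparing the directory names that exist against the names the new boundaries imply. A minimal sketch of that comparison follows; `plan_regeneration` and its return format are hypothetical and not an existing Starfish script.

```python
def plan_regeneration(existing, desired):
    """Decide which chunk directories to create, remove, or leave alone.

    existing : iterable of directory names currently on disk
    desired  : iterable of directory names implied by the new boundaries
    """
    existing = set(existing)
    desired = set(desired)
    return {
        "create": sorted(desired - existing),  # new or changed chunks
        "remove": sorted(existing - desired),  # stale chunks
        "keep": sorted(existing & desired),    # unchanged, posteriors still valid
    }


plan = plan_regeneration(
    existing=["000_005000_005100", "001_005100_005200"],
    desired=["000_005000_005100", "001_005100_005250"],
)
print(plan["create"])  # ['001_005100_005250']
print(plan["remove"])  # ['001_005100_005200']
```

Because the boundaries are encoded in the directory name, a chunk whose edges move shows up as one stale directory plus one new one, so unchanged chunks never need to be rerun.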

Tasks within each sub-directory

Are there any necessary changes that need to be made for the emulator? Currently nothing major comes to mind, but I could be forgetting something.

Necessary improvements to star_chunk.py

Note that these mini-tasks can also be launched en masse by a top-level bash script.

Inference

gully commented 7 years ago

Below is an illustrative figure of the types of inferences we will be able to achieve in the spectral chunking strategy.

These are posterior samples of T_hot and log g for LkCa 4 (Gully-Santiago et al. 2017), from Starfish fits to IGRINS spectral orders m = 101 (blue kernel density estimate), m = 114 (red KDE), and m = 117 (green KDE).

[Figure: temp_logg_example]