Using flux-calibrated models instead of normalized models.

gully commented 8 years ago

The problem: If we implement the mixture model feature, we will probably need to use flux-calibrated model spectra; in other words, the non-normalized spectra natively produced by the model grid (i.e. Phoenix provides real flux units for its models). The reason for needing to use flux-calibrated models should be clear- the relative flux of the mixture model components should scale with different guesses for the effective temperature. There are a few problems that could arise. First, we'll need to toggle on-and-off the normalization whether you're in mixture model or not (that's easy enough). Second, the PCA procedure in the spectral emulation step might acquire more--or at least different--eigenspectra, since the dominant variance will now come from the absolute flux level and not the spectral lines. (I haven't fully fleshed this out, but I suspect the default of applying normalization is there for a reason.) Lastly, the logOmega term might take on a different meaning, or at least different absolute values.

Suggested solution Just experiment with it and see how it performs. :)

gully commented 8 years ago

OK, I fleshed out some of the details here. We definitely don't want to use the non-normalized (i.e. raw) model grids in the spectral emulation process, for the reasons I speculated about above.

The right thing to do is probably to construct a scalar flux ratio correction factor as a function of the model grid labels (Teff, log g, [Fe/H]). We can do this in the following way: compute the "grid mean flux" for each model spectrum, for each m spectral orders. Then perform regression (e.g. Gaussian Process regression) to the absolute flux as a function of the stellar parameters. The GP regression strategy would be similar to the Spectral Emulator portion of the code, but since the flux mean is just a single scalar, we don't need dimensionality reduction (PCA), making the regression fairly straightforward, I think.

The implementation would be as follows:

Use grid.py --create to make a normalized model grid, as usual
A new function, e.g grid.py --create --absolute_fluxes or whatever, would be identical to grid.py --create, but would have norm=False, creating an absolute flux model grid.
A new function similar to pca.py would then perform the regression on the flux ratios.
In the evaluate step in parallel.py, you do: _M_mix = (1-c) M_1 + c * q_m M2, where _M1 is stellar component 1, _M2 is stellar component 2, c is the filling factor to be determined, and _qm is the flux ratio evaluated at the stellar labels for _M1 and _M2.

I'd have to think a bit more about the error propagation, but I think it's straight-forward-- it's just a scalar multiple.

iancze commented 8 years ago

You're right about the need to flux-standardize (not continuum-normalize) the input grid, this is to remove the largest source of variance in the models so that the emulation is more accurate.

Since there are many grids that (helpfully) provide spectra with real units, like PHOENIX, it would be very nice to propagate this information so that it may be used in mixture model applications. I think your suggestion of keeping a separate flux coefficient sounds reasonable. My naive guess is that this function might be smooth enough (in log(coefficient) space) that a simple linear interpolator might work well enough without resorting to GPs, but we would need to test that assumption.

gully commented 7 years ago

Ok, we figured this out after some deliberation. Note the portion of the code in grid_tools.py

        #If we want to normalize the spectra, we must do it now since later we won't have the full EM range
        if self.norm:
            f *= 1e-8 #convert from erg/cm^2/s/cm to erg/cm^2/s/A
            F_bol = trapz(f, self.wl_full)
            f = f * (C.F_sun / F_bol) #bolometric luminosity is always 1 L_sun

There's no need to normalize the spectra in the last line above. From the Husser et al. 2013 paper:

The files always contain one single primary extension, which holds the flux of the spectrum in units of [erg/s/cm2/cm] on the stellar surface.

on the stellar surface means they've integrated over a stellar disk filling half the sky, in other words, they've integrated over solid angle:

$\int d\Omega\, \cos\theta = \int_0^{2\pi}d\phi \int_0^{\pi/2}d\theta\, \sin\theta \cos\theta = \pi.$

Which yields a factor of $\pi$. Makes sense!

So what we should do for the code is:

Keep the per Angstrom (not per cm), since most of the code assumes angstroms for wavelengths. This doesn't make much difference either way but will just keep things simple.
Divide the spectrum by a factor of Pi. This will yield flux units of erg/cm^2/s/A/steradian, undoing the "at the stellar surface" bit. Then the product of solid angle and the flux will be back in units of the observed spectrum erg/cm^2/s/A. This is a subtle point, but will be useful when comparing to other emission sources that are correctly calibrated (e.g. Black Bodies).
Finally, the F_bol_interpolater I made up for the mixture model should be eliminated completely.

However, the flux standardized spectra in the emulator step should not be removed, because this standardization serves a separate purpose. I will post about this later in more detail.

gully commented 7 years ago

Short update on the flux standardization. I'm tracking scalar values in front of the emulated spectra so that we can maintain absolute fluxes. In PCAGrid.create in emulator.py:

        # Normalize all of the fluxes to an average value of 1
        # In order to remove uninteresting correlations
        flux_scalars = np.average(fluxes, axis=1)[np.newaxis].T
        fluxes = fluxes/flux_scalars

I then add instances of flux_scalars throughout the rest of emulator.py, mimicking flux_mean (and the eigenspectra weights). The goal is to interpolate a single value flux_scalar at an arbitrary grid point. For the reasons outlined in this Issue thread, it's probably safe to assume that the scalars are smooth as a function of stellar parameters, although this assumption will eventually break down if the chunk sizes are too small.

gully commented 7 years ago

Demo of flux_scalars over a grid for a range in Teff (shown), and small range in logg and [Fe/H] for a user-input wavelength segment and resolution:

(Don't read too much into these values-- they have been normalized with the deprecated system discussed above). The flux scalars should (generally) increase with Teff in a revised system.

gully commented 7 years ago

Here is a commit that implements the flux scalars approach in emulator.py on my fork: 7b2afc7

This code breaks backwards compatibility, since it inserts a new positional argument that is not expected in the existing methods. This commit is my current work-around, and by no means a recommendation for the path forward, but perhaps a good starting point.

Starfish-develop / Starfish

Using flux-calibrated models instead of normalized models. #38