Finite differences idea for Sonora family of models: abundance Jacobians

gully commented 2 years ago

I've been talking with @astrocaroline about an idea for augmenting the Sonora models with some cheap-and-useful ancillary information. Here is a sketch of the idea we have:

An opportunity: finite difference Jacobians for free cheap

Caroline explains that the computation of a single Sonora spectrum gets bottlenecked at the stage that computes physical self-consistency: assembling the T-P profile amidst all the opacitiy sources, etc. However, once that righteous $T-P$ profile has been obtained, perturbations to the underlying assumptions can be computed very cheaply. Examples of such perturbations include tweaking the abundances, for example.

These perturbations can be made small enough that the underlying assumptions that arrived at the $T-P$ profile are close to correct. Under such limiting cases the perturbed output spectra represent a Taylor Series expansion on the grid point. We can then obtain the Jacobian by taking finite differences, say between the original bona-fide grid point, and this new one:

$\lim{h \to 0} \frac{F\nu([H2O]+h) - F\nu([H2O])}{h} \approx \frac{\partial F\nu}{\partial \mathrm{[H_2O]}} $

where $h = \delta [H_2O]$ represents a small perturbation to the water abundance. So this equation represents the change in the emergent spectrum from perturbing the water abundance, at the native $R \sim 1,000,000$ spectral resolution of the precomputed synthetic spectrum!

The problem: how to deal with them?

Such Jacobian information is not currently packaged with the model outputs, and it would be tricky for a consumer to obtain an estimate without reverse-engineering aspects of the water line list that were already in-hand at the time of model convergence. So why don't the model owners compute these Jacobians, package them with the models, and publish them?

At least one reason is because it's not clear how folks would use them: they are a non-standard model output, and there apparently has not been any demand for them.

A `gollum`-based strategy for handling Jacobians

gollum provides a potentially new way to liberate these outputs, if they existed in the first place. For example, you could imagine an extension to the dashboard for water abundance that allows the user to move a slider for water abundance, restricted to in a small range of validity near the grid point. While imperfect, this visualization guide would let the user know whether certain features are attributable to water or not. Some heuristics for such a visualization exist, but mostly for low resolution spectra: essentially "water is this big bandhead". But as we move towards high resolution spectra, the heritage of any particular line or group of lines becomes much more difficult to interrogate: lines overlap and shift and ebb and flow. So I envision this tool as primarily unlocking new use cases at high spectral resolution.

There is another benefit: These Jacobians could be used to compute "Information Content" analyses, by taking the dot-product of Jacobians from different physical perturbations (each doctored with the same resolution kernels and sampling). That's a formalized way to answer questions like "to what extent is H2O abundance degenerate with FeH in my wavelength range of interest?". For some use cases this information content analysis could lessen the demand for the much-more expensive MCMC "retrieval" analysis that currently achieves similar aims. It could make it easier to write JWST proposals that assess the tradeoffs among instrument modes, for example, achieving better proposals and better overall resource allocation.

Hypothetically there may be a way to obtain finite-difference Hessians, by perturbing pairs of parameters, but I've though much less about that. I suppose that's just to say that there are even more spin-off technologies that could arise by building a workable prototype around these ideas.

How to actually generate the products, and who should do it?

One key idea is that it would involve making new model outputs. To date gollum has taken the models as given, precomputed text files stored on the web. I think it is beyond the purview of gollum to generate new model products. So likely the generating code would live in a separate repo, and then the products would get consumed with new gollum code. So there is code development in both of those places.

Who wants to work on this? Thoughts?

astrocaroline commented 2 years ago

I think this is a really cool idea and someone should work on it. 👍

My thermal emission code actually has a function for changing the molecular abundances individually (i.e. multiplying water abundance by some fraction) so I think this would be very easy to run the models for, and I'm happy to hand the code off for someone to run it!

gully commented 1 year ago

One subtlety I didn't emphasize here-- The Jacobians as described above are missing some expansion terms that can sometimes enter as leading order: let's stay there are $N_{layers}=100$ temperature layers $T_i$, evaluated at fixed Pressure coordinates $P_i$ that make up the $T-P$ profile. The change in water abundance affects those temperature layers. So the full first-order expansion looks like:

$\lim{h \to 0} \frac{F\nu([H2O]+h) - F\nu([H2O])}{h} =\frac{\partial F\nu}{\partial \mathrm{[H2O]}} \approx \frac{\partial F\nu}{\partial \sigma} \cdot \frac{\partial \sigma}{\partial \mathrm{[H_2O]}} + \sumi \frac{\partial F\nu}{\partial T_i} \cdot \frac{\partial T_i}{\partial \mathrm{[H_2O]}} + \cdots$

where $\sigma$ represents the total opacity/cross section of all species; the terms involving $\sigma$ represent how the spectrum changes if you change the opacities and everything else is held fixed (i.e. not solved for self-consistently). The terms involving $T_i$ account for how the thermal structure would respond to such a change in opacity.

Calculating the precise direction and extent of the thermal structure response is exactly the computationally expensive part we are trying to avoid. So the $\frac{\partial T_i}{\partial \mathrm{[H_2O]}}$ term needs some approximate treatment. There are a few choices:

1) just ignore those terms entirely, treating them as negligible $\frac{\partial T_i}{\partial \mathrm{[H_2O]}} = 0$ 2) guess a flux-weighted scalar correction: $\frac{\partial T_i}{\partial \mathrm{[H_2O]}} = \epsilon$ where $\epsilon$ is approximated through heuristics 3) leave it as a tunable parameter

The right thing to do depends on your science case, operating wavelengths, precision demands, etc.

BrownDwarf / gollum