Starfish-develop / Starfish

Tools for Flexible Spectroscopic Inference
https://starfish.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Reasonable values for sigma-contours plot #154

Closed astro-catherine closed 11 months ago

astro-catherine commented 1 year ago

I am trying to fit some high-resolution data, and I can't get the sigma-contours plot to generate properly. When using global_cov=dict(log_amp=-11, log_ls=2), the plot below is generated. Do you recommend other values for log_amp and log_ls? Do you recommend using the local_cov parameters in addition/instead?

[image: sigma-contours plot generated with log_amp=-11, log_ls=2]

gully commented 1 year ago

I recommend spot-checking draws or samples from your GP; depending on how the contours are created, they can mask (or not) the underlying smoothness of the GP.

Are your observed spectra in normalized or absolute flux units? An amplitude of $10^{-11}$ seems much smaller than the typical variations in your spectrum, so my gut instinct would be to initialize log_amp closer to -2. Meanwhile, log_ls=2 implies scale lengths of ~100 Angstroms, which is comparable to your bandwidth, so in that scenario the GP acts primarily as a continuum normalization. That's a fine use of a GP, especially if you have polynomial fitting turned off. The original spirit of the Global Covariance Kernel, however, is to capture residual structure at the line-by-line level of ~1 Å, so log_ls=0 or log_ls=-0.5 or so might get closer to the goal, depending on your resolution, vsini, etc.
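To build intuition for those two knobs, here is a minimal sketch, assuming a Matérn-3/2 kernel with natural-log parameters (similar in spirit to Starfish's global kernel, but not its exact implementation), showing how log_amp and log_ls shape draws from the covariance:

```python
import numpy as np

def global_kernel(wave, log_amp, log_ls):
    """Matern-3/2 covariance over a wavelength grid (illustrative sketch only)."""
    amp2 = np.exp(2.0 * log_amp)               # amplitude^2, assuming natural-log params
    ls = np.exp(log_ls)                        # scale length in Angstroms
    r = np.abs(wave[:, None] - wave[None, :])  # pairwise wavelength separations
    arg = np.sqrt(3.0) * r / ls
    return amp2 * (1.0 + arg) * np.exp(-arg)

wave = np.linspace(5000.0, 5100.0, 300)        # a 100 Angstrom chunk
rng = np.random.default_rng(42)
for log_ls in (2.0, 0.0, -0.5):
    cov = global_kernel(wave, log_amp=-2.0, log_ls=log_ls)
    # small jitter keeps the sampling numerically stable
    draw = rng.multivariate_normal(np.zeros_like(wave), cov + 1e-12 * np.eye(wave.size))
```

With log_ls=2, a single draw varies smoothly across the whole chunk (continuum-like); with log_ls=-0.5, it wiggles on the ~1 Å scale of individual lines.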

In my experience, tuning the GP is among the most subtle aspects of getting started with Starfish and ultimately obtaining plausible posteriors and achieving convergence. The correct initial values depend strongly on your units and resolution, so accepting the defaults often does not work. I could imagine a devoted tutorial just on this topic, since I've seen many people get tripped up by it. It's not a Starfish bug per se, so my guidance is to build intuition for tuning GPs in other contexts. Here is a code-free website that helps visualize why you may be seeing contours with smooth shapes, depending on the length scale and amplitude:

https://edward-rees.com/gp (click in the bottom panel to create fake data points for the GP to fit through)

Which tool are you using to make the GP contours/draws? I recommend double checking if you are plotting variance or standard deviation, since different frameworks provide one or the other (the covariance matrix expects variance-like units, the contours expect stdev-like units). You can verify which statistic is returned by spot-checking the docstring of whichever tool you use to generate the GP draws (or contours).
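As a quick sanity check (toy numbers here, not Starfish output): the matrix diagonal carries variance, and the plotted band should be its square root.

```python
import numpy as np

# toy covariance matrix: the diagonal is in VARIANCE units
cov = np.diag(np.array([0.04, 0.09, 0.16]))

# convert to stdev before drawing contour bands
sigma = np.sqrt(np.diagonal(cov))
# the 1-sigma band runs from mean - sigma to mean + sigma; plotting the
# variance instead would make the contours look dramatically too wide
# (or too narrow, when the variance is below 1)
```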

Somewhere in the tutorials or GitHub Issues there should be demos of plotting samples from the covariance matrix. It would look something like this:

```python
import numpy as np

# draw one realization from the zero-mean GP defined by cov_matrix
y_draw = np.random.multivariate_normal(np.zeros_like(y_obs), cov_matrix)
```

where `cov_matrix` is your $N_{\rm pix} \times N_{\rm pix}$ covariance matrix.

astro-catherine commented 1 year ago

These spectra are normalized. I tried plotting them with log_amp=-2 and log_ls=0, and the sigma-contours remained flat.

I will check out the website you recommended, as well as search the Starfish tutorials/GitHub Issues for the covariance matrix sample plot. Thanks so much for these recommendations! I'll comment on this thread with any updates.

gully commented 1 year ago

I think this is the line of code that generates your contours:

https://github.com/Starfish-develop/Starfish/blob/6b20086f85ed92332543a2b30dfd56e851d593a4/Starfish/models/spectrum_model.py#L756

It appears that the model.plot() function is merely plotting the diagonal of the covariance matrix, not a conditional GP.

I think what you have in mind is the conditional GP, such as this slide or this tutorial. Is that correct?

Otherwise, decreasing your error bars should narrow the contour bands.
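For reference, a textbook conditional GP looks like the following sketch (a plain squared-exponential kernel on toy data; none of this is Starfish API):

```python
import numpy as np

def rbf(x1, x2, amp=1.0, ls=1.0):
    """Squared-exponential kernel (toy choice for illustration)."""
    return amp**2 * np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / ls) ** 2)

# toy observed data
x_obs = np.array([0.0, 1.0, 2.5, 4.0])
y_obs = np.array([0.1, 0.8, -0.3, 0.5])
noise_var = 0.01

x_star = np.linspace(-1.0, 5.0, 200)                 # prediction grid
K = rbf(x_obs, x_obs) + noise_var * np.eye(x_obs.size)
K_s = rbf(x_star, x_obs)
K_ss = rbf(x_star, x_star)

mu = K_s @ np.linalg.solve(K, y_obs)                 # conditional (posterior) mean
cov_post = K_ss - K_s @ np.linalg.solve(K, K_s.T)    # conditional covariance
std = np.sqrt(np.clip(np.diagonal(cov_post), 0.0, None))
# std pinches near x_obs and relaxes back toward the prior amplitude far away;
# that pinch-and-flare shape is the hallmark of a conditional GP, unlike the
# flat band you get from plotting only the diagonal of a fixed covariance matrix
```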

gully commented 1 year ago

Here is where the covariance matrix diagonal gets populated: https://github.com/Starfish-develop/Starfish/blob/6b20086f85ed92332543a2b30dfd56e851d593a4/Starfish/models/spectrum_model.py#L338

The cov.diagonal term may be boosting the apparent uncertainty. But where does it come from?

Usually the diagonal of the covariance matrix should be identical to the input variance (the vector of $\sigma_i^2$ values). However, the emulator also produces a covariance matrix that I think of as the "discretization error matrix": the uncertainty attributable to interpolating the grid. In principle this matrix can have on-diagonal contributions. Whether it should is perhaps a matter of taste; I assume they arose from the addition of the kriging terms (the difference between the submitted and accepted versions of the paper). In any case, I recommend spot-checking and general quality assurance of the various terms that enter the covariance matrix to see if they make sense. For example, which term is dominant: the variance of the data, or the diagonal of the discretization error matrix?
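One way to run that spot-check is with a small diagnostic like this (a hypothetical helper; the arrays here are made-up stand-ins for your actual data variance, emulator output, and evaluated global kernel):

```python
import numpy as np

def dominant_diagonal_term(sigma2_data, emu_cov, global_cov):
    """Compare the median on-diagonal size of each covariance contribution."""
    terms = {
        "data variance": float(np.median(sigma2_data)),
        "emulator (discretization) diagonal": float(np.median(np.diagonal(emu_cov))),
        "global kernel diagonal": float(np.median(np.diagonal(global_cov))),
    }
    winner = max(terms, key=terms.get)
    return winner, terms

# made-up stand-ins: in a healthy fit, the data variance should dominate
n = 100
sigma2_data = np.full(n, 1e-2)     # sigma_i^2 from your spectrum
emu_cov = 1e-4 * np.eye(n)         # discretization error matrix
global_cov = 1e-3 * np.eye(n)      # evaluated global kernel
winner, terms = dominant_diagonal_term(sigma2_data, emu_cov, global_cov)
```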

The fact that the diagonal of the reconstruction matrix appears dominant may be a symptom of some other sickness. It fundamentally means that the pixel-to-pixel uncertainty in the reconstruction is larger than the uncertainty in your data, which should not be true. Other design choices could matter too, such as the extents of the grid you are training on and how many PCA terms you are retaining. The key idea is to build an intuition for what your GP covariance matrix should look like versus what it actually looks like, by surgically dissecting the terms going into it (discretization matrix, data variance, and global kernels). I would recommend plotting a heatmap of the matrix, like the one in the original Czekala et al. paper. Plotting the full 2000 x 2000 heatmap can overwhelm matplotlib, so maybe start with a subimage.
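A quick way to do that visual dissection (with a toy exponential-decay matrix standing in; swap in your own covariance matrix):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt

# toy stand-in for an N_pix x N_pix covariance matrix
n = 2000
idx = np.arange(n)
cov = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 50.0)

sub = cov[:200, :200]  # a subimage keeps matplotlib responsive
fig, ax = plt.subplots()
im = ax.imshow(sub, origin="lower", cmap="viridis")
fig.colorbar(im, ax=ax, label="covariance")
ax.set_xlabel("pixel")
ax.set_ylabel("pixel")
fig.savefig("cov_subimage.png", dpi=100)
```

Banding along the diagonal reflects the kernel's scale length; a bright, isolated diagonal relative to the off-diagonal structure is the signature of the dominant on-diagonal terms discussed above.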