Covariance between spectral elements

kboone commented 4 years ago

Is there any interest in supporting covariance matrices between different spectral elements? In sncosmo, we're looking at how we can fit models of supernovae to spectral data, and are interested in basing this off of specutils. One challenge is that we often have covariance information between the different spectral elements that needs to be included in the fit. A discussion of this can be found in sncosmo/sncosmo-eps#3.

At the most basic level, for a Spectrum1D object this involves simply storing a matrix of the covariance between the different spectral elements. It would be nice if the different ways of manipulating spectra could propagate the covariance where appropriate (e.g. resampling). A lot of those operations introduce covariance between spectral elements even if the spectral elements were independent before. I would be happy to help implement this if there is interest.

eteq commented 4 years ago

Definitely of interest, @kboone! The biggest problem right now is understanding exactly how to map this onto the current uncertainty framework. The natural thing as I see it would be to add an uncertainty object that's something like CovarianceMatrix, which does just what you said here and gets provided as the uncertainty at Spectrum1D creation time. The complexity lies in the fact that I'm really not sure what the best convention is for vector Spectrum1D objects (i.e. case 2 on https://specutils.readthedocs.io/en/stable/types_of_spectra.html#overview-of-how-specutils-represents-spectra). There's also some ongoing (slow...) work in Astropy core to come address this problem.

Having said all that, I think your use case might be a great one to try some trial implementations on. Perhaps the best way forward here would be to implement a CovarianceMatrix that explicitly only works on a 1d spectrum? I.e., raises a NotImplementedError if you try to give it to anything other than the scalar Spectrum1D? That gives us the freedom to experiment a bit to fit into the sncosmo use case while not having to solve the whole bigger problem?

hcferguson commented 4 years ago

Is the use case where one has a small covariance matrix that is the same for every pixel -- i.e. just giving the covariance between each (arbitrary) pixel and a few adjacent pixels? This can be implemented as a small array, with dimensions different than the spectrum. But of course this won't help deal with covariances that span a large spectral range.

Or are you needing the covariance of every pixel with every other pixel?

kboone commented 4 years ago

The use case that we are most concerned about is handling calibration uncertainties for a single spectrum. e.g. a zeropoint shift will introduce correlated offsets between all of the different spectral elements. The covariance matrix for this is not sparse... calibration uncertainties often introduce correlations between all of the different spectral elements. There are definitely use cases where you only have covariance between adjacent spectral elements (e.g. resampling a spectrum that was originally uncorrelated), but that isn't true in general.

Case 1 with a single Spectrum1D does seem to be very straightforward to implement with a CovarianceMatrix class for the uncertainty keyword. I agree that things get more complicated for vector Spectrum1D objects. Particularly, there is a question as to whether you want to support covariance across directions other than the spectral axis. That could be necessary for things like IFU spectrographs, but I don't think that there is a simple way of implementing this in a generic way and it will likely be very specific to the application.

For an initial implementation, I think it makes sense to only support covariance matrices along the spectral axis. This addresses most use cases, and is simple even for the Spectrum1D/SpectrumCollection cases 2 and 3 of the link you sent. For a MxN array of spectra with a spectral axis of length N, we can encode the covariance matrices as an MxNxN array where there are M independent covariance matrices for each of the M individual spectra.

I guess that the CovarianceMatrix class needs to be implemented in astropy.nddata.nduncertainty? It seems like this should be relatively straightforward to do, although I'm not sure that I totally understand how to make the propagation work. Is anybody working on this already?

astropy / specutils

Covariance between spectral elements #684