emission-line amplitudes are modified by resampling and convolution in unexpected ways

moustakas commented 1 year ago

Work on #137 has led to the discovery of a significant bug which impacts the reported emission-line amplitudes in a way which depends on the line-width (and therefore the error can be unexpectedly large for very narrow emission lines).

In retrospect, the issue should have been obvious, but the bug has been in place since the first versions of the code.

This is what I currently do in the emission-line fitting portion of the code:

Initialize each emission line as a Gaussian profile of a given line-center (in km/s, relative to the Redrock redshift), intrinsic line-width (in km/s), and peak amplitude;
Build the redshifted model emission-line spectrum with constant-velocity pixels at much higher sampling resolution than DESI spectroscopy (currently $\Delta v$=5 km/s);
Resample the model spectrum to match the extracted DESI wavelength vector (here I use the sample flux-conserving trapezoidal resampling algorithm used by Redrock);
Take the dot product of the data with the resolution matrix in order to account for the line-spread function;
Use non-linear least-squares to optimize the parameters of each emission line.

The problem is that the model emission-line amplitude (which I report as an output parameter, e.g., OII_3726_AMP) is the value of the initial model, but resampling and convolution change the amplitude in a way that depends on the line-width, as illustrated in the figure below.

In this figure I show a hypothetical single-line galaxy with a fixed intrinsic (input) peak-amplitude of 5.0 (arbitrary units) and varying line-width between 100 km/s and 5 km/s. I use a constant model pixel size of $\Delta v$=5.0 km/s and I show the initial model (blue squares) and the effect of resampling (orange points) and convolution (green xs). In each panel I also report the input theoretical flux, $A\sqrt{2\pi}\sigma\lambda_{z}/c$, where $A$ is the line-amplitude, $\sigma$ is the line-width (in km/s), $\lambda_z$ is the redshifted line-wavelength (in Angstrom), and $c$ is the speed of light (in km/s); it is this flux that is reported in the output tables (e.g., OII_3726_FLUX). The legend then shows the percent error in the flux after each step in the algorithm.

At the moment I am uncertain how to proceed, so suggestions from anyone are welcome.

moustakas commented 1 year ago

For the record there's no data in the figure above, but I am using a real resolution matrix from TARGETID=39627835538672534 on $DESI_ROOT/spectro/redux/iron/healpix/main/dark/273/27333/coadd-main-dark-27333.fits (see https://github.com/desihub/fastspecfit/pull/137#issuecomment-1650280434).

And the hypothetical emission line I'm investigating is [OIII] 4363 at a redshifted (z_RR=0.0594807) wavelength of 4624.0348. In the data this line should have an amplitude of ~0.4, but the code reports a totally incorrect amplitude of ~3.8, which is how this issue was discovered. Note that the line-width for this object is ~4.0 km/s.

moustakas commented 1 year ago

This issue was a conceptual, not a computational error.

In essence, the (emission-line) velocity width and amplitude are measured in model space, before resampling and convolution. This is what we want for the line-width, because we want to know the intrinsic width of the line. However, for the purposes of determining if a line is significant (relative to the scatter in the continuum pixels), we want the observed line-amplitude, after resampling and convolution.

In #137 we address this issue by reporting the model amplitude in a new {LINENAME}_MODELAMP parameter and measuring the observed amplitude and reporting it in the {LINENAME}_AMP column of the data model (and so this change should not impact any existing code, which should continue to use {LINENAME}_AMP*np.sqrt({LINENAME}_AMP_IVAR} to assess the significance of an emission line.

Closing.

desihub / fastspecfit

emission-line amplitudes are modified by resampling and convolution in unexpected ways #139