desihub / redrock

Redshift fitting for spectroperfectionism
BSD 3-Clause "New" or "Revised" License

problematic QSO / GALAXY fit #281

Closed moustakas closed 4 months ago

moustakas commented 4 months ago

At the data systems call today I presented this oddball. Redrock ranks QSO models as the top two fits, both at the wrong redshift according to visual inspection (VI), but the sharpness of the chi^2 scan at the minimum suggests that something is amiss. The GALAXY fit (shown in gray) is at the right redshift and looks like a clearly better fit, but is ranked as only the third-best minimum.

image

To reproduce:

rrdesi -i /global/cfs/cdirs/desi/users/ioannis/fastspecfit/redrock-templates/stacks/redux-templates-NMF-0.2-zscan01/vitiles/coadd-4-80613.fits \
  --targetids 39633348334715174 -o zbest.fits -d zdetails.h5 \
  -t /global/cfs/cdirs/desi/users/ioannis/fastspecfit/redrock-templates/stacks/rrtemplates/NMF-0.2 

And then (third minimum is the correct one):

from redrock.results import read_zscan
zs, zf = read_zscan('zdetails.h5')
zf
<Table length=9>
     targetid                z                     zerr          zwarn        chi2                               zz                       ... npixels spectype subtype ncoeff  znum     deltachi2
      int64               float64                float64         int64      float64                         float64[15]                   ...  int64    str6     str3  int64  int64      float64
----------------- ----------------------- ---------------------- ----- ------------------ ----------------------------------------------- ... ------- -------- ------- ------ ----- ------------------
39633348334715174      0.6854235253876094  6.435125524024464e-06     0  93344.39492797852        0.6834076602245445 .. 0.6872883156635221 ...    7925      QSO     LOZ      4     0 12612.389946937561
39633348334715174     0.28587434510372395  6.537056198330875e-06     0 105956.78487491608       0.2843674750024283 .. 0.28732824781181976 ...    7925      QSO     LOZ      4     1   6630.57647138834
39633348334715174      0.1497488356821888   8.95398520026666e-06     0 112587.36134630442      0.14874353633682325 .. 0.15033168098113836 ...    7925   GALAXY             40     2 2468.6016489243048
39633348334715174     0.15001886380785487 1.0588134929353917e-05     0 114171.00924909115      0.14865418458569857 .. 0.15130210595034432 ...    7925      QSO     LOZ      4     3   884.953746137573
39633348334715174 -0.00011648020537089704 3.7663491164321733e-06     0 115055.96299522872 -0.00015999999999999522 .. -7.9999999999995e-05 ...    7925     STAR       F      5     4 1561.0180569315417
39633348334715174 -0.00011806819387766023  3.601504140691688e-06     0 115147.43617168834 -0.00015999999999999522 .. -7.9999999999995e-05 ...    7925     STAR       G      5     5   1469.54488047192
39633348334715174      0.5070358450783161 1.3812901004104455e-05     0 116616.98105216026        0.5059934421940269 .. 0.5080754869184927 ...    7925   GALAXY             40     6   94.8396813510335
39633348334715174 -0.00011633057702154138  4.574228299972899e-06     0  116711.8207335113 -0.00015999999999999522 .. -7.9999999999995e-05 ...    7925     STAR       A      5     7  946.8167734374874
39633348334715174      0.1735644558032477  6.366069496624887e-05     0 117658.63750694878       0.17279763290025074 .. 0.1744190324734709 ...    7925   GALAXY             40     8                0.0
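As a sanity check on the ranking, the deltachi2 column above looks like the chi2 gap to the next-ranked fit at a meaningfully different redshift: rows 2 and 3 sit at nearly the same z, so row 2 is compared against row 4 rather than row 3. Here is a minimal sketch that reproduces the column from rounded table values. The function name `delta_chi2`, the 1000 km/s threshold, and the velocity formula are assumptions for illustration, not necessarily what Redrock does internally.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def delta_chi2(chi2, z, dv_limit=1000.0):
    """Gap in chi2 to the next-ranked fit whose redshift differs by more
    than dv_limit (km/s). Hypothetical helper; the threshold is an assumption."""
    chi2 = np.asarray(chi2, dtype=float)
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(chi2)
    for i in range(len(chi2)):
        for j in range(i + 1, len(chi2)):
            dv = C_KMS * abs(z[j] - z[i]) / (1.0 + z[i])
            if dv > dv_limit:
                out[i] = chi2[j] - chi2[i]
                break  # first sufficiently different redshift wins
    return out


# chi2 and z copied from the table above, rounded for readability
chi2 = [93344.39, 105956.78, 112587.36, 114171.01, 115055.96,
        115147.44, 116616.98, 116711.82, 117658.64]
z = [0.685424, 0.285874, 0.149749, 0.150019, -0.000116,
     -0.000118, 0.507036, -0.000116, 0.173564]

dchi2 = delta_chi2(chi2, z)
print(np.round(dchi2, 2))
```

If that reading is right, the GALAXY fit at z=0.1497 is well separated from everything ranked below it, but still sits roughly 19000 in chi2 behind the top QSO fit.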

sbailey commented 4 months ago

This bad fit is dominated by an unmasked large negative spike in the data around 8440 Angstroms, with a claimed abs(S/N) > 200, contributing ~15% to the overall sum((S/N)^2). The QSO PCA template "wins" because it fits that bogus highly significant line with negative [OIII]. The correct galaxy fit loses because it gets a big chi2 hit from not modeling that bogus line.
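For anyone wanting to check this kind of thing on another spectrum, here is a rough sketch of the per-pixel bookkeeping: S/N is taken as flux * sqrt(ivar), and each pixel's share of sum((S/N)^2) is its squared S/N over the total. The arrays below are synthetic (a flat spectrum with one spike). Only the pixel count (7925, from the table above) and the spike S/N of -200 come from this thread; the background level of 5.35 and the spike position are made up so that the share lands near 15%.

```python
import numpy as np


def snr_squared_fraction(flux, ivar):
    """Return per-pixel S/N and each pixel's fraction of sum((S/N)^2)."""
    snr = flux * np.sqrt(ivar)
    snr2 = snr ** 2
    return snr, snr2 / snr2.sum()


# Synthetic stand-in for a coadded spectrum (NOT the real data)
npix = 7925
flux = np.full(npix, 5.35)   # made-up flat continuum, S/N = 5.35 per pixel
ivar = np.ones(npix)
flux[4000] = -200.0          # one bogus negative spike with S/N = -200

snr, frac = snr_squared_fraction(flux, ivar)
worst = int(np.argmax(frac))
print(worst, snr[worst], round(float(frac[worst]), 3))
```

A single pixel carrying that much weight means any model that can bend toward it gets a large chi2 reward, regardless of how well it fits the rest of the spectrum.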

The "real" fix is to address this upstream so that this feature would be masked in the first place. Physicality constraints in both galaxy and QSO templates would help on the Redrock side. Another option could be to mask any wavelengths that are huge outliers from all of the template fits, and then re-fit. That would be computationally expensive, so proceed with caution...
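A toy version of that last option, to make the idea concrete: compute the pull (residual times sqrt(ivar)) of every best-fit model at every pixel, and zero the ivar wherever even the best-matching model is off by more than some threshold. Everything here is hypothetical (the helper name `mask_common_outliers`, the threshold of 10, the two hand-built models), and the expensive part, rerunning the full redshift scan with the updated ivar, is not shown.

```python
import numpy as np


def mask_common_outliers(flux, ivar, models, nsig=10.0):
    """Zero ivar at pixels that are > nsig outliers for ALL models.

    models has shape (n_models, n_pixels). Hypothetical helper."""
    pulls = (flux[None, :] - models) * np.sqrt(ivar)[None, :]
    bad = np.min(np.abs(pulls), axis=0) > nsig
    return np.where(bad, 0.0, ivar), bad


def chi2_per_model(flux, ivar, models):
    return np.sum(ivar[None, :] * (flux[None, :] - models) ** 2, axis=1)


# Toy spectrum: flat continuum with one bogus negative spike
flux = np.ones(100)
flux[50] = -50.0
ivar = np.ones(100)

model_a = np.ones(100)            # "right" model, ignores the spike
model_b = np.full(100, 1.2)       # "wrong" model, slightly off everywhere...
model_b[50] = -30.0               # ...but partially chases the spike
models = np.vstack([model_a, model_b])

chi2_before = chi2_per_model(flux, ivar, models)
new_ivar, bad = mask_common_outliers(flux, ivar, models)
chi2_after = chi2_per_model(flux, new_ivar, models)
print(chi2_before, chi2_after)
```

In this toy the wrong model wins before masking and loses afterwards. Note the caveat: a template flexible enough to fit the spike fully would keep that pixel from being flagged as an outlier for all models.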

moustakas commented 4 months ago

I was focused on the ivar spectrum and didn't check for negative pixel values. Nice catch.

I wonder if Redrock shouldn't internally mask high S/N negative pixels during fitting. Negative pixel values are fine but they shouldn't have S/N>>10.
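Something like the following is what I have in mind, as a sketch only: treat flux * sqrt(ivar) as the per-pixel S/N and set ivar to zero wherever it is more negative than a threshold. The helper name `mask_negative_spikes` and the example arrays are made up, and this is not meant to describe how Redrock would or does implement it.

```python
import numpy as np


def mask_negative_spikes(flux, ivar, nsig=5.0):
    """Zero ivar where flux is negative at more than nsig sigma.

    Hypothetical helper; nsig is a free parameter."""
    flux = np.asarray(flux, dtype=float)
    ivar = np.asarray(ivar, dtype=float)
    bad = flux * np.sqrt(ivar) < -nsig
    return np.where(bad, 0.0, ivar), bad


# Made-up pixels with S/N = [2, -1, -200, 4]
flux = np.array([1.0, -0.5, -40.0, 2.0])
ivar = np.array([4.0, 4.0, 25.0, 4.0])

masked_ivar, bad = mask_negative_spikes(flux, ivar)
print(masked_ivar, bad)
```

Mildly negative pixels (the -1 sigma one here) are left alone, since noise fluctuations below zero are expected; only the statistically implausible spike is dropped from the fit.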

We're throwing away a perfectly good redshift here by assuming the reductions are perfect.

sbailey commented 4 months ago

For the record: the extremely negative spike came from a bogus feature in a DARK calibration image, desi_spectro_calib/0.4.0/ccd/dark-sm1-z4-20200730.fits.gz, which was used for z4 data from 20201214 through 20201223. After that, we used a different dark image, desi_spectro_dark/v2209/dark_frames/dark-sm1-z4-20201228.fits.gz, which did not have the problem pixels. More recently, @Waelthus added a new dark, desi_spectro_dark/v2209/dark_frames/dark-sm1-b4-20201214.fits.gz, which also fixes this problem for the December 2020 data.

Masking large statistically "significant" negative values could still be valuable for Redrock and/or desispec, but this particular case is already fixed for future data runs.

Curiously, in the iron production, this feature was sometimes masked as a cosmic (despite being negative).

image

moustakas commented 4 months ago

Potentially related ticket: https://github.com/desihub/desispec/issues/2152#issuecomment-1879933452

moustakas commented 4 months ago

Fixed by #282, although I feel that we should make the --negflux-nsig masking the default for Jura and beyond.

sbailey commented 3 months ago

To be clear: rrdesi ... --negflux-nsig 5 (5 sigma negative flux masking) is the default, i.e. that is what you get if you don't specify the option at all. You only have to specify that option if you want to change the sigma threshold to something else.