desihub / redrock

Redshift fitting for spectroperfectionism
BSD 3-Clause "New" or "Revised" License

possible bug when mixing NMF and PCA templates in final redshift scans #276

Closed moustakas closed 5 months ago

moustakas commented 5 months ago

@sbailey I think this is a bug but I'm not certain and I haven't dug into the code to assess what might be going amiss:

First, run redrock with the nominal templates and then with a template set which includes NMF galaxy templates (but otherwise includes the same star and QSO templates):

source /dvs_ro/common/software/desi/desi_environment.sh main

rrdesi -i /global/cfs/cdirs/desi/users/ioannis/fastspecfit/redrock-templates/stacks/redux-templates-NMF-0.1-zscan01/vitiles/coadd-3-80605.fits \
  --targetids 39627688712865504 -o zbest.fits -d zdetails.h5

export RR_TEMPLATE_DIR=/global/cfs/cdirs/desi/users/ioannis/fastspecfit/redrock-templates/stacks/rrtemplates/NMF-0.1
ls -l $RR_TEMPLATE_DIR
total 27696
-rw-rw-r-- 1 ioannis desi 1984320 Jan 29 17:59 rrtemplate-galaxy-NMF-0.1.fits
-rw-rw-r-- 1 ioannis desi  336960 Jul 17  2023 rrtemplate-qso-HIZ.fits
-rw-rw-r-- 1 ioannis desi  290880 Jul 17  2023 rrtemplate-qso-LOZ.fits
-rw-rw-r-- 1 ioannis desi 3268800 Aug 17  2021 rrtemplate-star-A.fits
-rw-rw-r-- 1 ioannis desi 3297600 Aug 17  2021 rrtemplate-star-B.fits
-rw-rw-r-- 1 ioannis desi 1923840 Aug 17  2021 rrtemplate-star-CV.fits
-rw-rw-r-- 1 ioannis desi 3277440 Aug 17  2021 rrtemplate-star-F.fits
-rw-rw-r-- 1 ioannis desi 3248640 Aug 17  2021 rrtemplate-star-G.fits
-rw-rw-r-- 1 ioannis desi 3265920 Aug 17  2021 rrtemplate-star-K.fits
-rw-rw-r-- 1 ioannis desi 3346560 Aug 17  2021 rrtemplate-star-M.fits
-rw-rw-r-- 1 ioannis desi 4008960 Aug 17  2021 rrtemplate-star-WD.fits

rrdesi -i /global/cfs/cdirs/desi/users/ioannis/fastspecfit/redrock-templates/stacks/redux-templates-NMF-0.1-zscan01/vitiles/coadd-3-80605.fits \
  --targetids 39627688712865504 -o zbest-nmf.fits -d zdetails-nmf.h5

Looking at the fitting results, we see that the nominal (PCA) templates prefer the GALAXY at z=0.16364 (which is the right answer). Meanwhile, the template set which includes the NMF galaxy templates yields a STAR spectype at z=-0.0014, with the correct GALAXY redshift as the fourth minimum!

from redrock.results import read_zscan

zs, zf = read_zscan('zdetails.h5')
zf
<Table length=9>
     targetid               z                     zerr          zwarn        chi2        ... spectype subtype ncoeff  znum     deltachi2
      int64              float64                float64         int64      float64       ...   str6     str3  int64  int64      float64
----------------- ---------------------- ---------------------- ----- ------------------ ... -------- ------- ------ ----- ------------------
39627688712865504    0.16364439776838893 1.0766243480836462e-05     0  19214.88787150383 ...   GALAXY             10     0  27842.03577605635
39627688712865504    0.17690234647531164  2.692106748341094e-05     0  47056.92364756018 ...   GALAXY             10     1  6043.541360843927
39627688712865504    0.18493315451662629  3.556348128910181e-05     0 53100.465008404106 ...   GALAXY             10     2 11223.483248092321
39627688712865504 -0.0014753342400076923 1.1470858022826353e-05     0  64323.94825649643 ...     STAR       K      5     3   47841.9517738781
39627688712865504      1.039699278826281 2.6425853489245436e-05     0 112165.90003037453 ...      QSO     LOZ      4     4  2279.647886157036
39627688712865504     1.7249275860747917 0.00011775404198232079     0 114445.54791653156 ...      QSO     HIZ      4     5 312.11266976594925
39627688712865504     0.5302398868198309  3.203875319557626e-05     0 114757.66058629751 ...      QSO     LOZ      4     6 16004.549800656096
39627688712865504 -0.0014325498503955333  6.594297818415482e-06     0 130762.21038695361 ...     STAR       M      5     7                0.0
39627688712865504 -0.0017810087735271878 3.5796049921591294e-06     0 192944.74445472256 ...     STAR       G      5     8                0.0

zs_nmf, zf_nmf = read_zscan('zdetails-nmf.h5')
zf_nmf
<Table length=9>
     targetid               z                     zerr          zwarn        chi2        ... spectype subtype ncoeff  znum     deltachi2
      int64              float64                float64         int64      float64       ...   str6     str3  int64  int64      float64
----------------- ---------------------- ---------------------- ----- ------------------ ... -------- ------- ------ ----- ------------------
39627688712865504 -0.0014753342400076923 1.1470858022826353e-05     0  64323.94825649643 ...     STAR       K      5     0  2071.408624526026
39627688712865504     0.2592844076863871 1.8832468985475416e-05     0  66395.35688102245 ...   GALAXY             10     1 1659.1592923998833
39627688712865504    0.24543019059875015 2.5085768138572075e-05     0  68054.51617342234 ...   GALAXY             10     2 1106.9018910750747
39627688712865504    0.16378398397398541 2.0080150217357837e-05     0  69161.41806449741 ...   GALAXY             10     3 43004.481965877116
39627688712865504      1.039699278826281 2.6425853489245436e-05     0 112165.90003037453 ...      QSO     LOZ      4     4  2279.647886157036
39627688712865504     1.7249275860747917 0.00011775404198232079     0 114445.54791653156 ...      QSO     HIZ      4     5 312.11266976594925
39627688712865504     0.5302398868198309  3.203875319557626e-05     0 114757.66058629751 ...      QSO     LOZ      4     6 16004.549800656096
39627688712865504 -0.0014325498503955333  6.594297818415482e-06     0 130762.21038695361 ...     STAR       M      5     7                0.0
39627688712865504 -0.0017810087735271878 3.5796049921591294e-06     0 192944.74445472256 ...     STAR       G      5     8                0.0

However, the NMF fitting results don't match the redshift scan: the reported STAR K minimum is z=-0.0014, but the minimum of the STAR:::K scan is somewhere else!

import numpy as np
import matplotlib.pyplot as plt

targetid = 39627688712865504
redshifts = zs_nmf[targetid]['STAR:::K']['redshifts']
zchi2 = zs_nmf[targetid]['STAR:::K']['zchi2']
print(redshifts[np.argmin(zchi2)])
0.0020000000000000104

plt.plot(redshifts, zchi2)

image
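The check above can be wrapped in a small helper, sketched here under stated assumptions: it takes the `redshifts` and `zchi2` arrays of a single scan (assumed 1D) and reports the grid minimum along with whether that minimum sits on the edge of the scanned range. The name `scan_minimum` and the synthetic arrays are illustrative only and are not part of redrock.

```python
import numpy as np

def scan_minimum(redshifts, zchi2):
    """Return (z, chi2, at_edge) for the lowest point of a 1D chi2 scan.

    at_edge is True when the minimum is the first or last grid point,
    i.e. the true minimum may lie outside the scanned range.
    """
    redshifts = np.asarray(redshifts, dtype=float)
    zchi2 = np.asarray(zchi2, dtype=float)
    i = int(np.argmin(zchi2))
    at_edge = (i == 0) or (i == len(zchi2) - 1)
    return redshifts[i], zchi2[i], at_edge

# Synthetic scan (NOT real data): chi2 decreases monotonically,
# so the grid minimum lands on the upper edge of the range.
demo_z = np.linspace(-0.002, 0.002, 41)
demo_chi2 = 100.0 - 1.0e4 * demo_z
zmin, chi2min, at_edge = scan_minimum(demo_z, demo_chi2)
print(zmin, chi2min, at_edge)
```

With real output one would pass `zs_nmf[targetid]['STAR:::K']['redshifts']` and the matching `zchi2` array instead of the synthetic ones.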

So something is amiss. Any thoughts?

Tagging @dylanagreen as well since he's messing around with NMF fitting, too.

dylanagreen commented 5 months ago

I reproduced these on the same file/targetid using PCA galaxy templates and PCA/NMF QSO templates, but are we sure this is a bug with NMF mixing? The STAR Chi^2 scan is the same regardless of Galaxy/QSO template type:

Screen Shot 2024-01-31 at 9 29 59 AM

In both of the cases @moustakas presents above, the reported minimum for the STAR_K template is -0.0014, and the star zscan is identical, with the same chi^2 value. So changing the template types doesn't change the STAR zscan (as expected), but it did (accidentally) illuminate a potential problem: the reported minimum for the stellar template is off from what the zscan would indicate.

Chasing the rabbit hole a bit further, I ran with only the STAR-K template, and it reported two minima in the zscan: the aforementioned -0.0014 and the expected one at 0.002, with the expected ZWARN indicating that it is at the edge of the zscan range (1056 = 1024 + 32, bits 5 and 10, i.e. Z_FITLIMIT and BAD_MINFIT).

     targetid               z                     zerr          zwarn        chi2        ... spectype subtype ncoeff  znum  deltachi2
      int64              float64                float64         int64      float64       ...   str4     str1  int64  int64  float64
----------------- ---------------------- ---------------------- ----- ------------------ ... -------- ------- ------ ----- ---------
39627688712865504 -0.0014753342400076923 1.1470858022826353e-05     0  64323.94825649643 ...     STAR       K      5     0       0.0
39627688712865504  0.0020000000000000104 1.4543807701833745e-05  1056 62445.457767245876 ...     STAR       K      5     1       0.0
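The ZWARN arithmetic quoted above (1056 = 1024 + 32) can be checked with a few lines of Python. This is a minimal sketch: the helper `decode_zwarn` is hypothetical, and the lookup table names only the two bits mentioned in this thread; any other set bit is reported by number rather than guessed at.

```python
# Only the two bit names quoted in this thread are listed here;
# this is not a complete ZWARN mask definition.
ZWARN_BITS = {5: "Z_FITLIMIT", 10: "BAD_MINFIT"}

def decode_zwarn(zwarn):
    """Return the names (or 'bitN' placeholders) of all bits set in zwarn."""
    zwarn = int(zwarn)
    names = []
    bit = 0
    while (1 << bit) <= zwarn:
        if zwarn & (1 << bit):
            names.append(ZWARN_BITS.get(bit, f"bit{bit}"))
        bit += 1
    return names

print(decode_zwarn(1056))  # ['Z_FITLIMIT', 'BAD_MINFIT']
```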

I think the more concerning bug(?) here is that there are clearly better minima in the STAR scan (around 0.00075) that are neither of the two redshifts reported, but for some reason the scan is focusing on the one at -0.0014. I am not proficient enough in running redrock to chase this further; I tried to force redrock to scan more minima with --nminima 50, but it didn't change the above results.
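For anyone wanting to list the candidate minima of a scan independently of redrock, here is a rough sketch. It is not redrock's minimum finder: it simply flags interior grid points that are strictly lower than both neighbours and sorts them by chi^2. The function name `local_minima` is hypothetical, and the scan below is synthetic, with dip positions chosen to mimic the redshifts discussed above and depths invented for illustration.

```python
import numpy as np

def local_minima(redshifts, zchi2):
    """Return (z, chi2) of interior local minima, sorted by increasing chi2."""
    redshifts = np.asarray(redshifts, dtype=float)
    zchi2 = np.asarray(zchi2, dtype=float)
    # Interior points strictly below both neighbours; edge points are skipped.
    is_min = (zchi2[1:-1] < zchi2[:-2]) & (zchi2[1:-1] < zchi2[2:])
    idx = np.flatnonzero(is_min) + 1
    idx = idx[np.argsort(zchi2[idx])]
    return redshifts[idx], zchi2[idx]

# Synthetic scan (NOT real data): a deeper dip near z=0.00075
# and a shallower one near z=-0.0014.
grid = np.linspace(-0.002, 0.002, 401)
chi2 = (100.0
        - 30.0 * np.exp(-0.5 * ((grid - 0.00075) / 2e-4) ** 2)
        - 10.0 * np.exp(-0.5 * ((grid + 0.0014) / 2e-4) ** 2))
zmins, chi2mins = local_minima(grid, chi2)
for z, c in zip(zmins, chi2mins):
    print(f"z={z:+.5f}  chi2={c:.2f}")
```

Comparing such a list against the redshifts redrock reports would show directly whether a deeper scan minimum is being skipped.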

Short summary: based on the reported chi^2 values, NMF is just missing this galaxy. It's unlikely that this is a bug in the NMF/PCA mix, although there seem to be entirely separate issues with the stellar template zscan that I have no intuition on.

sbailey commented 5 months ago

Two issues:

Comparing the PCA and NMF galaxy chi2 vs. z scans (upper plot), both have a local minimum near z~0.16, but the NMF chi2 is considerably worse. Looking at the data and reconstructed templates (lower plot), both PCA and NMF are struggling to model a galaxy that red, but PCA appears to have more flexibility to absorb the continuum and thus gets a much better chi2, allowing that local minimum to win.

image

See code at $CFS/desi/users/sjbailey/debug/redrock276/plotfits.py; it's a bit too long to paste here.

This is consistent with what @dylanagreen has been seeing with QSO NMF templates: the broad chi2 vs. z shape is set by how well the templates can fit the continuum, and the depth of local minima is set by how well they can model the lines. With a similar number of templates for NMF vs. PCA, NMF has less broadband flexibility and thus is failing to model the overall color as well.
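A toy illustration of the flexibility argument, under an assumption that is not stated in this thread: that NMF fits restrict template coefficients to be non-negative while PCA fits allow either sign. The two templates, the "red" spectrum, and the function names are all hypothetical, and the brute-force subset search is a stand-in that is only practical for a handful of templates; it is not redrock's solver.

```python
import itertools
import numpy as np

def chi2_unconstrained(templates, flux, ivar):
    """Weighted linear least squares; coefficients may have either sign."""
    w = np.sqrt(ivar)
    coeff, *_ = np.linalg.lstsq(templates * w[:, None], flux * w, rcond=None)
    resid = flux - templates @ coeff
    return float(np.sum(ivar * resid**2)), coeff

def chi2_nonnegative(templates, flux, ivar):
    """Best fit with all coefficients >= 0, by trying every template subset."""
    n = templates.shape[1]
    best_chi2, best_coeff = float(np.sum(ivar * flux**2)), np.zeros(n)
    for k in range(1, n + 1):
        for subset in itertools.combinations(range(n), k):
            idx = list(subset)
            chi2, c = chi2_unconstrained(templates[:, idx], flux, ivar)
            if np.all(c >= 0) and chi2 < best_chi2:
                best_chi2 = chi2
                best_coeff = np.zeros(n)
                best_coeff[idx] = c
    return best_chi2, best_coeff

# Toy setup (NOT real templates): a flat template and a "blue" one
# that falls with wavelength; the spectrum rises with wavelength ("red").
wave = np.linspace(0.0, 1.0, 50)
templates = np.column_stack([np.ones_like(wave), 1.0 - wave])
flux = 0.5 + wave            # equals 1.5*flat - 1.0*blue
ivar = np.ones_like(wave)

chi2_free, coeff_free = chi2_unconstrained(templates, flux, ivar)
chi2_nn, coeff_nn = chi2_nonnegative(templates, flux, ivar)
print("unconstrained:", chi2_free, coeff_free)
print("non-negative: ", chi2_nn, coeff_nn)
```

In this toy case the unconstrained fit reproduces the red slope by subtracting the blue template, which the non-negative fit cannot do, so its chi^2 is necessarily no better and here is clearly worse.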

Conclusion:

moustakas commented 5 months ago

Thanks to both @sbailey and @dylanagreen.

@sbailey regarding your plot which shows the data and both the PCA and NMF model spectra: both of those fits are complete crap. I agree it's quite dusty, but that's not a particularly "hard" spectrum to model. E.g., here's the FastSpecFit model fit: https://fastspecfit.desi.lbl.gov/target/sv1-dark-17691-39627688712865504

So what this tells me is that there are too few NMF (and possibly PCA) components... This is helpful...

image

sbailey commented 5 months ago

In the end this wasn't a bug in mixing NMF and PCA, and the action items from debugging this have been spawned off to other tickets (#277 zmin neighbor exclusion for stars; #279 and #280 improved templates). Closing.