desihub / fastspecfit

Fast spectral synthesis and emission-line fitting of DESI spectra.
https://fastspecfit.readthedocs.org
BSD 3-Clause "New" or "Revised" License
13 stars 2 forks source link

ensure the correct QSO redshifts are being used #149

Closed moustakas closed 1 year ago

moustakas commented 1 year ago

@ashleyjross @paulmartini @geordie666 @stephjuneau @VFawcett

Before merging #148, I'd like to confirm that I'm using the correct QSO redshifts. My current procedure is documented here, which is supposed to mimic the choices made when generating the official QSO catalog-- https://fastspecfit.readthedocs.io/en/latest/vacs.html#updated-qso-redshifts https://github.com/desihub/fastspecfit/blob/main/py/fastspecfit/io.py#L843-L873

However, I'm concerned that I'm missing some subtleties.

E.g. there are 1151 objects in iron where I appear to be using the (incorrect) Redrock redshift, Z_RR, but Z_RR<1e-3, and so the object never appears in the FastSpecFit VAC for Iron.

import numpy as np
import fitsio
from astropy.table import Table

tt = Table(fitsio.read('/global/cfs/cdirs/desi/spectro/fastspecfit/iron/catalogs/fastspec-iron-main-dark.fits', 'METADATA'))
jj = Table(fitsio.read('/global/cfs/cdirs/desi/survey/catalogs//Y1/QSO/iron/QSO_cat_iron_main_dark_healpix_v0.fits'))

miss = jj[np.logical_not(np.isin(jj['TARGETID'], tt['TARGETID']))]['TARGETID', 'DESI_TARGET', 'SURVEY', 'PROGRAM', 'HPXPIXEL', 'Z', 'Z_RR', 'Z_QN']
miss
<Table length=1151>
     TARGETID         DESI_TARGET     SURVEY PROGRAM HPXPIXEL         Z                Z_RR         Z_QN
      int64              int64         str4    str4   int64        float64           float32      float32
----------------- ------------------- ------ ------- -------- ------------------ --------------- ---------
39627357253796173              262148   main    dark    36545  2.853520698300789   0.00020393773 2.8347204
39627375029262340              917542   main    dark    36191  2.530053022991597   -7.260663e-05 2.5582354
39627375067009318              262148   main    dark    22537 1.5922189576381058   0.00010028701 1.5808651
39627380221813460 4611686018427650052   main    dark    36537   1.08083928464285   0.00085696904 1.0782522
39627391559012147 4611686018427650052   main    dark    16837 1.3491390560448964   -6.837142e-05 1.3475368
39627392200740123 4611686018427650052   main    dark    36320 1.0291348651820496     0.000146363 1.0388956
39627397368125143 4611686018427715589   main    dark    16848 1.2527383651595285  -0.00046188416 1.2826067
39627398177627447              262148   main    dark    22560 1.4673898209182226   -7.343006e-05 1.4840186
39627409309308355 4611686018427650052   main    dark    36557 1.1745866435460977   -0.0015088934 1.1646299
39627409766487786              262148   main    dark    36221 1.3379025697051008     3.47225e-05 1.3364595
39627426464009140              262148   main    dark    16857 2.2585268726370686   -0.0009436496 2.2664669
39627432948405274              917542   main    dark    36329 1.9303214777590592  0.000109551205 1.9266946
39627438891735836              262148   main    dark    36307 1.7733256074481123    7.094293e-05 1.7815315
39627438946256010              917542   main    dark    22570 2.8231088542152305   8.2937884e-05 2.8338919
39627438992396336 4611686018427650052   main    dark    22566 2.8187862111476365   0.00013235913 2.8133914
39627444604377223 4611686018427650052   main    dark    36629 1.2477833320407106    -0.001611617 1.2454591
39627450539316086 4611686018427650052   main    dark    36327 1.3270181950012672    7.193237e-05  1.344643
              ...                 ...    ...     ...      ...                ...             ...       ...
39633466018500952                3622   main    dark    16064 1.4722533128273234   -5.043559e-05 1.5081482
39633468308587614 4611686018427388932   main    dark    16028  1.227129658168781   0.00062727125 1.2419482
39633468342145202                1028   main    dark    16064 1.3035021109696598   -7.647194e-05 1.3045537
39633471982798225                3622   main    dark     7598  1.876178579387708  -9.7424425e-05 1.8776968
39633475187247820                1028   main    dark    16066 1.3479028391146826 -0.000107706255 1.3354051
39633478685296906                1028   main    dark     7939 2.6122712371659333   -0.0034994304 2.6171224
39633503385551649                1028   main    dark     7951 1.3503309800547376  -0.00053241465 1.3334382
39636270304992132              262148   main    dark    22575 3.1619751284990287   1.6172382e-05 3.1827903
39636282036459123 2305843009213956100   main    dark    22665 1.5699769354036364    0.0008147801 1.5818841
39636347434041900              262148   main    dark    22630 2.2510100207834336   4.6285286e-06  2.246606
39636359106790862              262148   main    dark    36693  1.761109025999172   7.8319346e-05 1.7816185
39636377268126958              917542   main    dark    22637 1.7538339373329905   4.4817047e-05  1.770211
39636633238114290              262148   main    dark    31327   2.60179477218226  -0.00026447637 2.6271539
39636671964122208 4611686018427715589   main    dark    17870  1.075252463040938   0.00095967355 1.0764211
39637174110391678              262148   main    dark     4839  1.631923686948274   0.00025237695  1.607901
39637192221393962              262148   main    dark     8689 1.3465802158440203   -6.143585e-05 1.3637639
39637295552268374              262148   main    dark     2135 3.1030483781275437    -0.003112147 3.0987177
39637320621622267              262148   main    dark    13432 3.2368772231999463  -0.00018771099   3.28481
moustakas commented 1 year ago

OK, this is definitely a bug. What happens is these objects are never even read because Z_RR<1e-3, so they never have a chance to get their redshifts "corrected". I'm working on a fix now.

moustakas commented 1 year ago

Another issue is that I appear to be using incorrect redshifts for 3299 objects (see below, building on the code above). Drilling down into the first object, it appears that the logic I'm using (https://fastspecfit.readthedocs.io/en/latest/vacs.html#updated-qso-redshifts) may be incorrect / incomplete:

qn = Table(fitsio.read('/global/cfs/cdirs/desi/spectro/redux/iron/healpix/main/dark/361/36183/qso_qn-main-dark-36183.fits'))
qn[qn['TARGETID'] == 39627340543690901]['TARGETID', 'Z_NEW', 'Z_RR']
<Table length=1>
     TARGETID           Z_NEW          Z_RR
      int64            float64       float32
----------------- ----------------- ---------
39627340543690901 2.536549385403353 1.7805845

Full sample:

jj['HEALPIX'] = jj['HPXPIXEL']
out = join(tt, jj, keys=['TARGETID', 'SURVEY', 'PROGRAM', 'HEALPIX', 'DESI_TARGET'], table_names=['FAST', 'LSS'])
assert(len(out)+len(miss) == len(jj))

diff = out[out['Z_LSS'] != out['Z_FAST']]['TARGETID', 'SURVEY', 'PROGRAM', 'HEALPIX', 'Z_LSS', 'Z_FAST', 'DESI_TARGET']
diff
<Table length=3299>
<Table length=3299>
     TARGETID     SURVEY PROGRAM HEALPIX       Z_LSS               Z_FAST           DESI_TARGET
      int64        str4    str4   int64       float64             float64              int64
----------------- ------ ------- ------- ------------------ ------------------- -------------------
39627340543690901   main    dark   36183  2.536549385403353  1.7805844892920242              655394
39627357115385691   main    dark   36527  0.626857834582093  1.9446037297101326              655458
39627357140554446   main    dark   36530 1.1274715946805622  1.6260670972765672              655394
39627357719365101   main    dark   36212  2.370761483167071  1.6487141423026246              655394
39627362802867407   main    dark   17416 2.6943721468996578  1.9043881509138207              655394
39627362853199454   main    dark   36526 2.3463482953972963  3.2562304333552183 4611686018428043362
39627368582619242   main    dark   17411 1.9271584236034853  1.8342251149308004              655394
39627369064958461   main    dark   36276 2.1988898383021307   1.513625845008762              655458
39627374936982135   main    dark   36293  2.573733947715017   1.812128927914253              655394
39627375037649512   main    dark   22538  2.138624622149523  1.5021115150445838              655394
39627380087588624   main    dark   16757 0.5935763860655094   1.866448017977218              655394
39627386483904772   main    dark   36295 1.6466358174239513  1.6152247224385443              655394
39627392175576356   main    dark   36279 1.9377994162447914    5.63741184509524              655394
39627397821109996   main    dark   36640 2.2597093836114177  1.5630537386538588              655458
39627398114707701   main    dark   36219  5.126483126386998    5.17799143745699             1179778
39627398215371922   main    dark   22539 2.1836807758694303  1.5008758140050233              655458
39627403638607415   main    dark   36641 0.7637694268293987  2.1411360775987105              655458
              ...    ...     ...     ...                ...                 ...                 ...
39636353285097367   main    dark   22681 2.7229406323123695  1.9633068476745965              655394
39636437234093926   main    dark   22737 0.6943514171425593   2.017052092038529              655394
39636478124361246   main    dark   17385 2.2028826505364596  1.5506542021004148              655458
39636485284038684   main    dark   22766 2.9477605139352923   3.011673020867196              655394
39636515575305340   main    dark   22922 2.5643468362578252  0.9521238940731278              655394
39636521615101921   main    dark   22922 1.9610394046977955  2.0388997366605253              655394
39636557761614845   main    dark   23121  2.057927341422701  1.4696118000228529              655394
39636584995232869   main    dark   31145 3.2922356038522884  3.3097406415653707              655394
39636620919441149   main    dark   31410 0.7081089774723569  2.0155123524353673              655394
39636649352628424   main    dark   21959 1.6367039453254502   2.335067645429564              655394
39636671964124237   main    dark   17871   2.27499952781342   2.262812289029426              655394
39636755615320847   main    dark   19514 1.9733396641479715  1.9499487475460613              655394
39636755640488415   main    dark   19513 2.9232141318006715   2.086047172557037              655394
39636761525094755   main    dark   19594   1.43901641851695   2.040846745977564              655394
39636943272673627   main    dark    8491 2.2434052702293714  1.5843428896089298              655394
39636974255997521   main    dark   20019 2.4333791155817437  1.7228507451146609              655394
39637018577209037   main    dark    8589 1.7052459068768118 0.15112310050866345              655394
39637295594210896   main    dark     939 2.1980977735385685  1.5239411169369623              655458
moustakas commented 1 year ago

So it looks like none of these 3299 objects are QSO targets, which is why they are excluded. @stephjuneau can you comment on how I should be handling these objects?

from desitarget.targetmask import desi_mask
np.sum(desi_mask.QSO & diff['DESI_TARGET'] != 0)
  0

Looking at the QSO maker code (?), it looks like ELG and BGS targets are also allowed to have their redshifts updated by QuasarNet: https://github.com/desihub/LSS/blob/main/py/LSS/qso_cat_utils.py#L351-L362

If this is the code that I should be using it would be great if it was written in a more modular way somewhere.

moustakas commented 1 year ago

Here's that first object at the QuasarNet redshift. I guess it's not crazy that that's Lyman-alpha (note: 2-pixel smoothing on the left-hand side):

Screenshot 2023-08-14 at 10 07 21 PM
stephjuneau commented 1 year ago

Hi @moustakas

Looking at the QSO maker code (?), it looks like ELG and BGS targets are also allowed to have their redshifts updated by QuasarNet: https://github.com/desihub/LSS/blob/main/py/LSS/qso_cat_utils.py#L351-L362

If this is the code that I should be using it would be great if it was written in a more modular way somewhere.

Yes, this is the correct code. By modular, do you mean having a separate function as part of QSO-maker or elsewhere in the desi software stack? I guess I'd suggest applying these rules first before cutting out on redshift to remove possible stellar contaminants.

ashleyjross commented 1 year ago

A note on all of this, which might be helpful for context: We generally run on everything, so that we can catch QSO that were not targeted as QSO. The non-QSO target info is (so far, that I know of) only used by the GQP group (with lya showing some interest about anything high redshift). For the LSS catalogs, we only query the QSO info after we have already cut to only QSO targets. What is there for the other targets is, I think, still somewhat experimental (in terms of purity and redshift accuracy, at least last I knew, @stephjuneau should know better). Exactly what to do with it, I'm not sure. I don't believe we have an "official" answer for these within DESI at this point.

stephjuneau commented 1 year ago

@moustakas could you please send me the full sample of ~3k above? I tried to copy/paste from GitHub but it didn't expand the truncated part of the table.

moustakas commented 1 year ago

@stephjuneau here's the code (combined from a couple places above):

import numpy as np
import fitsio
from astropy.table import Table, join

tt = Table(fitsio.read('/global/cfs/cdirs/desi/spectro/fastspecfit/iron/catalogs/fastspec-iron-main-dark.fits', 'METADATA'))
jj = Table(fitsio.read('/global/cfs/cdirs/desi/survey/catalogs//Y1/QSO/iron/QSO_cat_iron_main_dark_healpix_v0.fits'))
jj['HEALPIX'] = jj['HPXPIXEL']
out = join(tt, jj, keys=['TARGETID', 'SURVEY', 'PROGRAM', 'HEALPIX', 'DESI_TARGET'], table_names=['FAST', 'LSS'])

diff = out[out['Z_LSS'] != out['Z_FAST']]['TARGETID', 'SURVEY', 'PROGRAM', 'HEALPIX', 'Z_LSS', 'Z_FAST', 'DESI_TARGET']
stephjuneau commented 1 year ago

@moustakas I don't have a complete analysis yet but some first findings:

I'm attaching redshift distributions for the two values. The bottom panels zooms in around 0<z<1 and from this, there are 37 galaxies at low-redshift (z<0.2) with ZWARN=0. So looking at those 37 could be useful to identify whether there is a specific failure mode. BUT this is a tiny number given that this is for all of Iron-main-dark.....

image
stephjuneau commented 1 year ago

Maybe a question for @dylanagreen : does QN have a lower redshift limit of z>0.5? Or would this be when RR is ran again with the QSO priors? It seems pretty abrupt on the redshift histograms above where Z_LSS comes from the LSS catalog for some QSOs identified with QN that were assigned a new redshift relative to RR.

stephjuneau commented 1 year ago

Here's the first z=0.2 case with Z_FAST=0.199 but clearly the Z_LSS from QN is much better at z=1.8 with high confidence (there's even a MgII absorption doublet on top of the broad MgII, which I guess was confused for Ha+[NII] by RR). I'm on Team QSO for this one.

image

The second example with Z_FAST=0.16 actually now has Z_PIPE from the Daily reduction in better agreement with Z_LSS so again a QSO (although MgII is in an overlap region, both CIII] and CIV seem credible): https://www.legacysurvey.org/viewer-desi/desi-spectrum/daily/targetid39627575147891193

stephjuneau commented 1 year ago

Here are more detailed notes on specific cases from my earlier VI of the first 25.

Table with new Daily redshifts (w.r.t. Iron):

     TARGETID     SURVEY PROGRAM HEALPIX       Z_LSS             Z_FAST       Z_DAILY COMMENT ON Z_DAILY
----------------- ------ ------- ------- ----------------- ------------------ ------- -------------------------------------------------
39627357719365101   main    dark   36212 2.370761483167071 1.6487141423026246 2.3699  agrees with Z_LSS (QN)
39627362853199454   main    dark   36526 2.346348295397296 3.2562304333552183 3.2204  disagress with both
39627374936982135   main    dark   36293 2.573733947715017 1.812128927914253  1.0847  disagress with both
39627375037649512   main    dark   22538 2.138624622149523 1.5021115150445838 2.1831  agrees with Z_LSS (QN)
39627397821109996   main    dark   36640 2.259709383611418 1.563053738653859  0.0077  will be lost with DAILY (assigning to GALAXY now; Z_LSS is correct for this LAE)
39627398114707701   main    dark   36219 5.126483126386998 5.17799143745699   5.0000  all are wrong; correct z=1.3076 ([OII] doublet)
39627398215371922   main    dark   22539 2.183680775869430 1.50087581400502   2.1840  agrees with Z_LSS (QN)
39627403638607415   main    dark   36641 0.763769426829399 2.1411360775987105 2.1431  agrees with Z_FAST (could be z=2.1337)
39636353285097367   main    dark   22681 2.722940632312369 1.9633068476745965 0.2331  all are wrong; LAE at z=2.77 (now DAILY assigning to GALAXY)
39636478124361246   main    dark   17385 2.202882650536460 1.5506542021004148 0.0020  Z_LSS is closer but both are wrong; DAILY says STAR but ZWARN=BAD_MINFIT; correct z=2.2455 (LAE)

Cases where both Z_LSS and Z_FAST/Z_DAILY seem wrong

Cases where Z_LSS is just slightly off in a puzzling manner

Beautiful examples where new redshift is important:

moustakas commented 1 year ago

@stephjuneau thank you for the detailed analysis.

I agree there are some improvements in the redshifts for a small number of non-QSO targets, but I would prefer to not bring in a dependency on the MgII after-burner fitting results. I personally think that all of these cases can be improved through better spectrophotometric templates.

Since I fixed the original bug (which was missing genuine QSOs and their updated QuasarNet redshifts), I would like to close this ticket. However, is there somewhere else we can move your VI work? Maybe https://github.com/desihub/desigal/issues?