desihub / redrock

Redshift fitting for spectroperfectionism
BSD 3-Clause "New" or "Revised" License

ZWARN bit 2 DELTACHI2 set for LOZ/HIZ overlap #312

Closed sbailey closed 1 month ago

sbailey commented 3 months ago

Jura QSO redshifts have an excess of ZWARN=4 (bit 2 DELTACHI2) flags for z=1.4 to 1.6, which @stephjuneau correctly guessed was due to the overlap between QSO LOZ and HIZ subtypes. Templates with the same SPECTYPE but different SUBTYPE should not trigger a ZWARN DELTACHI2 flag if they have the ~same redshift. However, that "same redshift" window was developed on SPECTYPE=STAR templates and might be too tight for QSOs, leading to this excess.

image
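For readers less familiar with the bitmask convention: ZWARN=4 is simply the integer value of bit 2 on its own (1 << 2). A minimal sketch of decoding it follows; the constant and helper names are hypothetical illustrations, not redrock's API.

```python
# Bit position of the DELTACHI2 warning as described in this issue (bit 2).
SMALL_DELTA_CHI2_BIT = 2
SMALL_DELTA_CHI2 = 1 << SMALL_DELTA_CHI2_BIT  # integer value 4


def has_small_delta_chi2(zwarn: int) -> bool:
    """Return True if bit 2 is set in an integer ZWARN value (hypothetical helper)."""
    return (zwarn & SMALL_DELTA_CHI2) != 0
```

A spectrum with ZWARN=4 has only this bit set; ZWARN=5 would have it set together with bit 0.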

abrodze commented 2 months ago

I identified two goals of the changes made for #300:

(1) When scanning the chi2 surface for a given template type, max_velo_diff, the minimum velocity offset between two minima for both to count as "good solutions", should be smaller for STAR types (currently 100 km/s) than for GALAXY/QSO (currently 1000 km/s).

(2) When ranking the N best fits across all template types, a valid solution with deltachi2<40 relative to another valid solution should not trigger ZWARN.SMALL_DELTA_CHI2 if the velocity offset between the two redshifts is less than max_velo_diff, i.e. (velocity offset between z_1 and z_2 < max_velo_diff) -> no zwarn. This allows STAR/QSO subtypes to have the "same" redshift solution without triggering a warning (bonus: spectra fit equally well by GALAXY and QSO at the same redshift will also not be flagged -- good for our boundary cases). The value of max_velo_diff for which two STAR fits qualify as the "same redshift" should be stricter (100 km/s) than when comparing GALAXY/QSO to GALAXY/QSO/STAR (1000 km/s).

(1) seems to work fine, but (2) imposes max_velo_diff = 100 km/s for all comparisons (not just STAR-STAR) owing to its placement in zfind.py. About 80% of the SMALL_DELTA_CHI2 flags for QSO best fits at 1.4<z<1.6 have a second-best fit from the other QSO subtype within 500 km/s.

PR #315 (work in progress) attempts to fix the issue with (2).
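The rule described in goal (2) can be sketched as below. This is an illustration under stated assumptions, not the code in zfind.py: the function names are hypothetical, the velocity offset is assumed to follow the common convention c (z - z_ref) / (1 + z_ref), and the deltachi2 value in the usage lines is made up.

```python
C_KMS = 299792.458  # speed of light in km/s


def velocity_offset_kms(z: float, zref: float) -> float:
    """Velocity offset of z relative to zref, in km/s (assumed convention)."""
    return C_KMS * (z - zref) / (1.0 + zref)


def flags_small_delta_chi2(deltachi2, z_best, z_other, max_velo_diff,
                           min_deltachi2=40.0):
    """Return True if the pair should raise the small-deltachi2 warning.

    The warning is raised only when the two fits are close in chi2
    AND far enough apart in velocity to count as different redshifts.
    """
    close_in_chi2 = deltachi2 < min_deltachi2
    same_redshift = abs(velocity_offset_kms(z_other, z_best)) < max_velo_diff
    return close_in_chi2 and not same_redshift


# Redshifts of a QSO-LOZ / QSO-HIZ pair; the offset is roughly 150 km/s.
z_loz, z_hiz = 1.570988329023113, 1.5722801626648066
dv = velocity_offset_kms(z_hiz, z_loz)

# deltachi2=10.0 is a made-up value used only for illustration.
flagged_at_100 = flags_small_delta_chi2(10.0, z_loz, z_hiz, max_velo_diff=100.0)
flagged_at_1000 = flags_small_delta_chi2(10.0, z_loz, z_hiz, max_velo_diff=1000.0)
```

With a 100 km/s window this pair counts as two different redshifts and is flagged; with a 1000 km/s window it counts as the same redshift and is not.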

abrodze commented 2 months ago

Testing velodiff on the bright and dark time spectra from 5 random healpix yields the following ZWARN changes from jura->velodiff (note that SPECTYPE/SUBTYPE/Z are unchanged for the N best solutions from jura->velodiff, as expected):

| TARGETID | BEST SPEC/SUB | BEST Z | 2nd SPEC/SUB | 2nd Z | ZWARN Jura | ZWARN velodiff |
|---|---|---|---|---|---|---|
| 39627918141297673 | GALAXY | -0.00038110198879575535 | STAR-G | -0.00043842250761299596 | 4 | 0 |
| 39627918145490404 | QSO-LOZ | 1.570988329023113 | QSO-HIZ | 1.5722801626648066 | 4 | 0 |
| 39628534909504805 | STAR-K | -0.0008686975090943437 | GALAXY | -0.0002122427337868294 | 4 | 0 |
| 39628295049841654 | QSO-LOZ | 1.5423072585152233 | QSO-HIZ | 1.5355944974929774 | 4 | 0 |
| 39628289404308576 | QSO-HIZ | 1.550410268576145 | QSO-LOZ | 1.5516679257428017 | 4 | 0 |
| 39628289412697609 | QSO-LOZ | 1.4023193160330976 | QSO-HIZ | 1.4083295493824548 | 4 | 0 |
| 39628352079794080 | QSO-LOZ | 1.5791245239679095 | QSO-HIZ | 1.5782587641246637 | 4 | 0 |
| 39628439069657952 | QSO-LOZ | 1.492384745922154 | QSO-HIZ | 1.491430594388815 | 4 | 0 |

Lingering thoughts: I think we actually want max_velo_diff = 100 km/s if one type in the comparison is STAR, not only when both are STAR. See the first and third rows of the table, for example. I would consider these cases of genuine confusion that we do want to be flagged as unreliable fits. Are there any counterpoints to this?

akremin commented 1 month ago

I'm just coming up to speed here, so please let me know if I'm thinking about this too simplistically: I think there is a genuine difference in the precision we can expect from GALAXY/QSO template fits compared to STAR fits. One could imagine using a third threshold for these hybrid cases. Combining the two in quadrature (as a root-mean-square of 100 and 1000 km/s) gave me a back-of-the-envelope value of ~711 km/s, which we could round to 700 km/s as a threshold for these. This accounts for the limited precision of the extragalactic fit and the higher precision of the stellar fit. If, however, you're empirically seeing a clear bimodal distribution with a clean separation beyond 100 km/s, then I could be convinced to use the more stringent threshold.
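The comment does not show its arithmetic, so the following is one reading that reproduces the quoted ~711 km/s: a root-mean-square of the two thresholds rather than a plain quadrature sum. The variable names are illustrative only.

```python
import math

star_kms = 100.0            # STAR threshold quoted in this thread
extragalactic_kms = 1000.0  # GALAXY/QSO threshold quoted in this thread

# Root-mean-square of the two thresholds.
rms_kms = math.sqrt((star_kms**2 + extragalactic_kms**2) / 2)

# Plain quadrature sum, shown for comparison.
quad_sum_kms = math.hypot(star_kms, extragalactic_kms)

print(f"{rms_kms:.1f}")       # 710.6
print(f"{quad_sum_kms:.1f}")  # 1005.0
```

The RMS matches the ~711 km/s figure, whereas the plain quadrature sum would exceed the larger threshold.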

akremin commented 1 month ago

After discussion on the data telecon and with Allyson privately, it seems like any choice here is fine. We are only discussing 2 objects out of 11,000. In addition, these objects are ones with galaxy and stellar redshifts very close to one another, so they are either at very low redshift or cases of genuine confusion in redrock.

I recommend taking the min() of the two velo_diff values. The implication is that more of these confusion cases between galaxies/QSOs and stars will set ZWARN bit 2. But "more" in this case is still a very small number (2/11000 in Allyson's test).
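A minimal sketch of the min() recommendation, assuming per-SPECTYPE thresholds with the values quoted earlier in this thread. The dictionary and function names are hypothetical and do not reflect how the fix was actually written.

```python
# Per-SPECTYPE "same redshift" windows in km/s, as quoted in this thread.
MAX_VELO_DIFF_KMS = {"STAR": 100.0, "GALAXY": 1000.0, "QSO": 1000.0}


def pair_max_velo_diff(spectype_a: str, spectype_b: str) -> float:
    """Window for comparing two fits: the stricter (smaller) of the two."""
    return min(MAX_VELO_DIFF_KMS[spectype_a], MAX_VELO_DIFF_KMS[spectype_b])
```

Under this rule any comparison involving a STAR fit uses 100 km/s, while QSO-QSO, GALAXY-GALAXY, and GALAXY-QSO comparisons use 1000 km/s.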

akremin commented 1 month ago

Closed with redrock PR #317

abrodze commented 1 month ago

Testing on K1 for a secondary confirmation that the issue is fixed:

Image Image

3.4% of QSO SPECTYPE flagged ZWARN=4 at all z (4.3% with Jura)

5.6% of QSO SPECTYPE flagged ZWARN=4 for 1.4<z<6 (13.6% with Jura)