ZWARN bit 2 DELTACHI2 set for LOZ/HIZ overlap

sbailey commented 3 months ago

Jura QSO redshifts have an excess of ZWARN=4 (bit 2 DELTACHI2) flags for z=1.4 to 1.6, which @stephjuneau correctly guessed was due to the overlap between QSO LOZ and HIZ subtypes. Templates with the same SPECTYPE but different SUBTYPE should not trigger a ZWARN DELTACHI2 flag if they have the ~same redshift. However, that "same redshift" window was developed on SPECTYPE=STAR templates and might be too tight for QSOs, leading to this excess.

abrodze commented 2 months ago

I identified two goals of the changes made for #300:

(1) When scanning the chi2 surface of a given template type, the minimum velocity offset max_velo_diff between two minima for both to count as separate "good solutions" should be smaller for STAR types (currently 100 km/s) than for GALAXY/QSO (currently 1000 km/s).

(2) When ranking the N best fits across all template types, a valid solution with deltachi2<40 relative to another valid solution should not trigger ZWARN.SMALL_DELTA_CHI2 if the velocity offset between the two redshifts is less than max_velo_diff, i.e. (|dv(z_1, z_2)| < max_velo_diff) -> no zwarn. This allows STAR/QSO subtypes to have the "same" redshift solution without triggering a warning (bonus: spectra fit equally well by GALAXY and QSO at the same redshift will also not be flagged, which is good for our boundary cases). The value of max_velo_diff for which two STAR fits qualify as the "same redshift" should be stricter (100 km/s) than when comparing GALAXY/QSO to GALAXY/QSO/STAR (1000 km/s).

(1) seems to work fine. (2), however, imposes max_velo_diff = 100 km/s for all comparisons (not just STAR-STAR) owing to its placement in zfind.py. ~80% of the SMALL_DELTA_CHI2 flags for QSO best fits at 1.4<z<1.6 have a second-best fit from the other QSO subtype within 500 km/s.
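To make the intended behavior of (2) concrete, here is a minimal sketch of the flag decision. It is not the code in zfind.py: the function names are hypothetical, and the velocity offset definition c|z_1 - z_2|/(1 + z_1) is an assumption made for illustration.

```python
C_KMS = 299792.458  # speed of light in km/s

# Thresholds quoted in this thread (km/s), keyed by SPECTYPE.
MAX_VELO_DIFF = {"STAR": 100.0, "GALAXY": 1000.0, "QSO": 1000.0}


def velo_diff_kms(z1, z2):
    """Velocity offset between two redshifts (assumed definition)."""
    return C_KMS * abs(z1 - z2) / (1.0 + z1)


def pair_threshold(spectype1, spectype2):
    """Intended rule from goal (2): 100 km/s only for STAR vs STAR."""
    if spectype1 == "STAR" and spectype2 == "STAR":
        return MAX_VELO_DIFF["STAR"]
    return MAX_VELO_DIFF["GALAXY"]


def small_deltachi2_flag(deltachi2, z1, spectype1, z2, spectype2,
                         min_deltachi2=40.0):
    """True if the pair should be flagged as a small-deltachi2 case."""
    if deltachi2 >= min_deltachi2:
        return False  # the best fit is clearly preferred
    # Close in chi2: flag only if the redshifts genuinely disagree.
    return velo_diff_kms(z1, z2) >= pair_threshold(spectype1, spectype2)
```

With these assumptions, two QSO subtype fits a few hundred km/s apart are not flagged, while the same offset between two STAR fits is.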

PR #315 (work in progress) attempts to fix the issue with (2)

abrodze commented 2 months ago

Testing velodiff on the bright and dark time spectra from 5 random healpix yields the following ZWARN changes from jura->velodiff (note SPECTYPE/SUBTYPE/Z are unchanged for the N best solutions from jura->velodiff, as expected):

TARGETID	BEST SPEC/SUB	BEST Z	2nd SPEC/SUB	2nd Z	ZWARN Jura
39627918141297673	GALAXY	-0.00038110198879575535	STAR-G	-0.00043842250761299596	4
39627918145490404	QSO-LOZ	1.570988329023113	QSO-HIZ	1.5722801626648066	4
39628534909504805	STAR-K	-0.0008686975090943437	GALAXY	-0.0002122427337868294	4
39628295049841654	QSO-LOZ	1.5423072585152233	QSO-HIZ	1.5355944974929774	4
39628289404308576	QSO-HIZ	1.550410268576145	QSO-LOZ	1.5516679257428017	4
39628289412697609	QSO-LOZ	1.4023193160330976	QSO-HIZ	1.4083295493824548	4
39628352079794080	QSO-LOZ	1.5791245239679095	QSO-HIZ	1.5782587641246637	4
39628439069657952	QSO-LOZ	1.492384745922154	QSO-HIZ	1.491430594388815	4
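As a rough cross-check of the QSO rows in the table, the velocity offsets between the best and second-best redshifts can be estimated as below. The definition c|z_1 - z_2|/(1 + z_1) is an assumption and may differ from what redrock uses internally; the helper name is hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s


def velo_diff_kms(z_best, z_second):
    """Velocity offset between two redshifts (assumed definition)."""
    return C_KMS * abs(z_best - z_second) / (1.0 + z_best)


# (TARGETID, best Z, 2nd Z) for the QSO-LOZ/QSO-HIZ rows of the table above.
qso_rows = [
    (39627918145490404, 1.570988329023113, 1.5722801626648066),
    (39628295049841654, 1.5423072585152233, 1.5355944974929774),
    (39628289404308576, 1.550410268576145, 1.5516679257428017),
    (39628289412697609, 1.4023193160330976, 1.4083295493824548),
    (39628352079794080, 1.5791245239679095, 1.5782587641246637),
    (39628439069657952, 1.492384745922154, 1.491430594388815),
]

offsets = {tid: velo_diff_kms(z1, z2) for tid, z1, z2 in qso_rows}

for tid, dv in offsets.items():
    print(f"{tid}  dv = {dv:6.1f} km/s")
```

Under this assumed definition, all six QSO pairs land between 100 and 1000 km/s, i.e. outside the STAR window but inside the GALAXY/QSO one.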

Lingering thoughts: I think we actually want max_velo_diff = 100 km/s if one type in the comparison is STAR, not only when both are STAR. See the first and third rows of the table, for example. I would consider these cases of genuine confusion that we do want to be flagged as unreliable fits. Are there any counterpoints to this?

akremin commented 1 month ago

I'm just coming up to speed here, so please let me know if I'm thinking about this too simplistically: I think there is a genuine difference in the precision we can expect from GALAXY/QSO template fits compared to STAR fits. One could imagine using a third threshold for these hybrid cases. Combining the two in quadrature gave me a back-of-the-envelope estimate of ~711 km/s, which we could round to 700 km/s as a threshold for these. This accounts for the limited precision of the extragalactic fit and the higher precision of the stellar fit. If, however, you're empirically seeing a clear bimodal distribution with a clean separation beyond 100 km/s, then I could be convinced to use the more stringent threshold.
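For anyone wanting to reproduce the number: the ~711 km/s figure matches the root-mean-square of the two thresholds (the quadrature sum divided by sqrt(2)). That reading of the back-of-the-envelope estimate is an assumption; a plain quadrature sum would give ~1005 km/s instead.

```python
import math

star_kms = 100.0            # STAR threshold from this thread
extragalactic_kms = 1000.0  # GALAXY/QSO threshold from this thread

# Plain quadrature sum: sqrt(100^2 + 1000^2)
quadrature_sum = math.hypot(star_kms, extragalactic_kms)

# Root-mean-square of the two values: quadrature sum / sqrt(2)
rms = quadrature_sum / math.sqrt(2)

print(f"quadrature sum = {quadrature_sum:.0f} km/s")  # ~1005
print(f"rms            = {rms:.0f} km/s")             # ~711
```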

akremin commented 1 month ago

After discussion on the data telecon and with Allyson privately, it seems like any choice here is fine. We are only discussing 2 objects out of 11,000. In addition, these objects have galaxy and stellar redshifts very close to one another, so they are either at very low redshift or cases of genuine confusion in redrock.

I recommend taking the min() of the two velo_diff values. The implication is that more of these confusion cases between galaxies/QSOs and stars will set ZWARN bit 2. But "more" in this case is still a very small number (2/11000 in Allyson's test).
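A minimal sketch of the min() rule, assuming a per-SPECTYPE lookup of the thresholds quoted in this thread. The dictionary and function names are hypothetical and are not taken from the redrock code.

```python
# Thresholds quoted in this thread (km/s), keyed by SPECTYPE.
MAX_VELO_DIFF = {"STAR": 100.0, "GALAXY": 1000.0, "QSO": 1000.0}


def pair_max_velo_diff(spectype1, spectype2):
    """Use the stricter (smaller) threshold of the two templates compared."""
    return min(MAX_VELO_DIFF[spectype1], MAX_VELO_DIFF[spectype2])


print(pair_max_velo_diff("QSO", "QSO"))      # 1000.0: LOZ/HIZ overlap tolerated
print(pair_max_velo_diff("GALAXY", "STAR"))  # 100.0: mixed case uses STAR window
print(pair_max_velo_diff("STAR", "STAR"))    # 100.0
```

Any comparison involving a STAR template then uses the 100 km/s window, while GALAXY/QSO comparisons keep the 1000 km/s window.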

akremin commented 1 month ago

Closed with redrock PR #317

abrodze commented 1 month ago

Testing on K1 for a secondary confirmation that the issue is fixed:

3.4% of QSO SPECTYPE flagged ZWARN=4 for all z (4.3% with Jura)

5.6% of QSO SPECTYPE flagged ZWARN=4 for 1.4<z<6 (13.6% with Jura)

desihub / redrock

ZWARN bit 2 DELTACHI2 set for LOZ/HIZ overlap #312