desihub / QuasarNP

Pure Numpy implementation of QuasarNet
MIT License

Retrain QuasarNet with DESI data, optionally adjusting network structure #8

Closed sbailey closed 3 months ago

sbailey commented 4 months ago

Retrain QuasarNET with DESI data with the intention of improving purity and completeness on DESI data compared to the original version trained on eBOSS data.

Optionally update network structure (e.g. adding dropouts, more layers) if that helps.

dylanagreen commented 3 months ago

A summary of the work on this. In the past 3 weeks I have created a new version of the QuasarNET weights with the following recipe:

Here's the N(z) of the potential training data: in blue the cross-matched catalog, in orange the "smoothed" catalog we pick spectra from. [image: nz]

For validation we did 3 tests:

  1. Validation on real data and the validation set (me)
  2. LyA simulations, $z \gtrsim 2$ (Julien Guy)
  3. 1x depth (Christophe)

Validation on Real Data & Validation Set

The new network shows "wiggles" in the N(z) distribution comparable to those of the previous eBOSS-trained weights: [image: nz]

In terms of validation results, the new network is more pure/complete at the same cutoffs as used with the previous weights file: [image: roc_purity]

And it shows considerably less scatter/bias in the redshift estimates of this dataset:

[images: validation_efficiency, validation_single_histogram]
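For reference, the purity and efficiency (completeness) numbers quoted throughout this thread follow the standard definitions. Here is a minimal NumPy sketch with illustrative variable names (this is not QuasarNP's API):

```python
import numpy as np

def purity_efficiency(conf, is_qso, cut):
    """Purity and efficiency of a confidence cut.

    conf   : per-object QuasarNet confidence, shape (N,)
    is_qso : boolean truth labels, shape (N,)
    cut    : objects with conf > cut are selected as quasars
    """
    selected = conf > cut
    true_pos = (selected & is_qso).sum()
    purity = true_pos / selected.sum()      # fraction of the selection that is real
    efficiency = true_pos / is_qso.sum()    # fraction of real quasars recovered
    return purity, efficiency

# Toy example: 6 objects, 3 true quasars.
conf = np.array([0.99, 0.97, 0.40, 0.96, 0.20, 0.98])
is_qso = np.array([True, True, False, True, False, False])
p, e = purity_efficiency(conf, is_qso, cut=0.95)  # p = 0.75, e = 1.0
```

Raising the cut shrinks the selection, which generally raises purity at the cost of efficiency; that trade-off is exactly what the tests below explore.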

LyA Simulations

For validation we ran the old weights, the Jura weights (the ones that were worse), and these weights (v1.3).

Per Julien: "It's visibly more precise (less scatter), no visible steps in QN redshifts, overall better purity with 0.05% outliers in the final catalog instead of 0.12% with Y1 QN. So as far as we can tell with the sims, this is the best set of QN weights." [image: results_v1_3]

[images: results_y1, results_jura]
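Outlier fractions like the ones Julien quotes (0.05% vs. 0.12%) are typically computed with a velocity-error metric. A hedged sketch follows; the 3000 km/s threshold is my assumption, not necessarily the cut used in these sims:

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def outlier_fraction(z_est, z_true, dv_max=3000.0):
    """Fraction of catastrophic redshift outliers, defined as
    dv = c * |z_est - z_true| / (1 + z_true) exceeding dv_max km/s."""
    dv = C_KMS * np.abs(z_est - z_true) / (1.0 + z_true)
    return np.mean(dv > dv_max)

# Toy example: one catastrophic failure out of four redshifts.
z_true = np.array([2.1, 2.5, 3.0, 2.2])
z_est  = np.array([2.101, 2.499, 3.4, 2.2])
frac = outlier_fraction(z_est, z_true)  # 0.25
```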

1x Depth (For tuning confidence cuts)

We ran Jura 1x Depth with the new v1.3 weights file and collated the results for Christophe. Per Christophe, quoted from an email to desi-data:

The truth map is obtained with the SV truth table based on an extended target selection (i.e. the SV selection; note that the main selection is included in the SV selection). In the final catalog of QSOs, we combine Redrock results and afterburner results (including QuasarNet).
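The combination step described here can be sketched as a simple OR of the Redrock classification and the afterburner confidence cut. This is illustrative only, with hypothetical input names; the real catalog-building logic is more involved:

```python
import numpy as np

def combine_selections(rr_is_qso, qn_conf, cl_cut=0.95):
    """An object enters the QSO catalog if Redrock classifies it as a
    QSO, or if the QuasarNet afterburner confidence exceeds cl_cut.
    Hypothetical inputs; not the actual pipeline code."""
    return rr_is_qso | (qn_conf > cl_cut)

rr_is_qso = np.array([True, False, False, True])
qn_conf   = np.array([0.30, 0.99, 0.50, 0.97])
in_catalog = combine_selections(rr_is_qso, qn_conf)  # [True, True, False, True]
```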

If we keep the method for producing the catalog as it is, we obtain:

Catalog with old QuasarNet (jura1x_oldqn)

| Selection | Purity | Efficiency |
| --- | --- | --- |
| SV QSO TS | 0.991 | 0.850 |
| Main QSO TS | 0.996 | 0.893 |

Catalog with new QuasarNet (jura1x_newqn)

| Selection | Purity | Efficiency |
| --- | --- | --- |
| SV QSO TS | 0.973 | 0.913 |
| Main QSO TS | 0.991 | 0.947 |

We cannot draw a conclusion from this test alone. Therefore we have two options: either relax the cut on the old QN or tighten the cut on the new QN, so that the two selections are compared in the same regime.

First Test

To be able to compare the two CNNs, I have relaxed the cut on the old QN to get the same efficiency (~0.95) for the main QSO TS. In practice, for the old QN: `CL > 0.95` → `CL > 0.10`.

Catalog with old QuasarNet (jura1x_oldqn)

| Selection | Purity | Efficiency |
| --- | --- | --- |
| SV QSO TS | 0.931 | 0.913 |
| Main QSO TS | 0.982 | 0.943 |

At the same efficiency as the new QN, the old QN is more contaminated.

Second Test

To be able to compare the two CNNs, I have tightened the cut on the new QN to get the same purity (~0.996) for the main QSO TS. In practice, for the new QN: `CL > 0.95` → `CL > 0.999`.

Catalog with new QuasarNet (jura1x_newqn)

| Selection | Purity | Efficiency |
| --- | --- | --- |
| SV QSO TS | 0.992 | 0.866 |
| Main QSO TS | 0.996 | 0.914 |

At the same purity as the old QN, the new QN is more efficient (more complete).

The new QN wins both tests! I would recommend using the new QN with a tighter cut, for instance `CL > 0.99` (instead of `CL > 0.95`), which gives:

Catalog with new QuasarNet (jura1x_newqn)

| Selection | Purity | Efficiency |
| --- | --- | --- |
| SV QSO TS | 0.986 | 0.898 |
| Main QSO TS | 0.995 | 0.937 |
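Tuning the cut as Christophe recommends amounts to scanning thresholds and reading off the purity/efficiency trade-off. A toy sketch with made-up confidences (not DESI data):

```python
import numpy as np

def scan_cuts(conf, is_qso, cuts):
    """Purity and efficiency as a function of the confidence cut."""
    rows = []
    for cut in cuts:
        sel = conf > cut
        tp = (sel & is_qso).sum()
        rows.append((cut, tp / sel.sum(), tp / is_qso.sum()))
    return rows

# Four true quasars and two contaminants, one contaminant at high confidence.
conf   = np.array([0.999, 0.998, 0.992, 0.96, 0.97, 0.50])
is_qso = np.array([True, True, True, True, False, False])
for cut, purity, eff in scan_cuts(conf, is_qso, [0.95, 0.99]):
    print(f"CL > {cut}: purity {purity:.3f}, efficiency {eff:.3f}")
# CL > 0.95: purity 0.800, efficiency 1.000
# CL > 0.99: purity 1.000, efficiency 0.750
```

As in the real results above, tightening the cut removes contaminants faster than it removes true quasars only if the contaminants sit below the new threshold, which is why the cut must be tuned against a truth table.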

For the critical region z ~ 3.75, we need more statistics to conclude. This fine-tuning can be done after the Kibo processing.

I understood Stephen's question to be whether we saw a problem with the new QN. The answer: with the statistics available from the 1x_deep truth table, the new QN is better, and it can be used for the Kibo production.

Conclusions

In all validation tests the new weights file exceeds the performance of the old one, including when combined with the Redrock reruns in the full pipeline. The only remaining question is the QuasarNET results around the overlap region, z ~ 3.75. In this region QuasarNET was originally used to remove quasars from an overdensity generated by Redrock (the reason for this overdensity is understood to be camera offset errors). For Kibo I believe we have recommended a change (implemented recently) to use a ZWARN and TARGET cut in that region instead, since Redrock correctly flags most of these spurious identifications as probably inaccurate. As Christophe indicated, we will still need to tune the QuasarNET cutoff in this region if we want to combine it with QuasarNET information. Otherwise I see no reason not to proceed with this weights file.

dkirkby commented 3 months ago

Thanks for this detailed summary @dylanagreen!

sbailey commented 3 months ago

Looks good, thanks for collecting the validation plots and comments here. I have put this into /global/cfs/cdirs/desi/target/catalogs/lya/qn_models/qn_desi+eboss_4layers_log_grid_v1.3.h5 for use with Kibo. Closing ticket as done.