Nice work. Are there any performance plots on the wiki (instead of posting them here)?
Performance on optical MC: FANN shows a clear improvement with more PMTs working on the top array, while the improvement is tiny for the TensorFlow net. In the plots below, the error is the absolute distance between the true and reconstructed position.
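For reference, a minimal sketch of how that error metric can be computed, assuming hypothetical `true_xy` and `reco_xy` arrays of (x, y) positions (the names and values are illustrative, not from the actual processing code):

```python
import numpy as np

# Hypothetical (N, 2) arrays of (x, y) positions in cm; in practice these
# would come from the optical MC truth and the reconstruction output.
true_xy = np.array([[0.0, 0.0], [10.0, -5.0]])
reco_xy = np.array([[0.3, -0.4], [10.5, -4.8]])

# Error = absolute (Euclidean) distance between true and reconstructed position
error = np.linalg.norm(true_xy - reco_xy, axis=1)
print(error)  # [0.5  0.53851648]
```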
@tunnell Fully agree! Can you help to move the maps to rundb instead? I don't have any knowledge about rundb....
@weiyuehuan It's clear that TensorFlow's performance is better than FANN's; maybe you can send an email to the analysis group to suggest a switch to it in post-SR1 analyses? Any opinions from @mcfatelin and @hasterok?
I agree that TensorFlow looks better. I think we should try to use it for the post-SI analyses. @tunnell Do we have the resources to reprocess SR1 and SR0?
I also strongly agree with moving to the TF NN, if only because it's more robust against overfitting. @hasterok You'd have to ask Boris and/or Evan, but the resources just have to be found if there is an analysis need.
@tunnell @hasterok I'm not sure I understand this correctly.... I thought the TFNN is in the hax level. For switching to TFNN, we need the correction maps (S1, S2 and FDC) from TFNN. @weiyuehuan Can you please confirm?
We definitely have the resources if we want to reprocess all of SR0+SR1. I would just ask that we make sure that any other changes like 2-fold coincidence that we might want are implemented and tested beforehand so we don't try to reprocess twice.
Edit: I also share the same question as Qing; I was told before that gains are the only correction really used at the pax level. I guess the NN is also?
@mcfatelin @hasterok @ershockley @tunnell No, the TensorFlow-related variables are already saved in the posrec minitrees and were ready even for the SR1 analysis, including the corresponding FDC. We didn't use it because of the famous 'time pressure'. My suggestion was that the analysis coordinators should encourage people to use it for post-SR1 analysis. Things dependent on the reconstructed position are:
Sorry, between this thread and the email exchange on reprocessing, it's still not clear to me what the situation is.
So the TensorFlow map is implemented at the hax level but FANN is at the pax level? Is this right?
How is this related to the gain model? Do we need to update the gain model first and then update FANN again, or is it just that, since we want to bump versions anyway, we want to update the gains at the same time?
I just want to also add that I will not process data using pax_head. It needs to be a tagged release, so this needs to be merged before proceeding.
@ershockley What you describe here is correct.
After merging this PR I will make a pax release, say pax_v6.9.1. @tunnell, why has the automatic check stopped? @ershockley, if we want to process data with pax_v6.9.1, we will hit the map name issue in rundb; do we have a solution yet?
Thanks @feigaodm. For the map name issue, I think it's only a problem if we want to continue processing with v6.8.0 while also (re)processing with this new release, and I don't see why we would. If we bump to a new version and then stop v6.8.0, we can just update all runs to use the new map, right?
@coderdj is probably the best expert on this. Don't we just have to update the corrections DB or something? Then cax should update runsDB for us.
I would suggest implementing FANN in hax at the same level as the TFNN. This would just be an interim solution until all of this can be fully/properly refactored (@tunnell?), but it should at least remove the dependence on RunsDB, and we can easily define the map run-dependence the same way as for the TFNN. Then we don't need to reprocess when just changing NNs (any new processing will simply use the old dummy NN). Also, in principle, if the NN method hasn't changed and only the inputs changed, then it shouldn't require a new pax version, i.e. we don't need to reprocess old runs with the same map, and only need to correct new runs that used the wrong map. See the sketch below for what that run-dependence could look like.
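A minimal sketch of that run-dependent map selection, purely illustrative: hax does not necessarily expose this exact hook, and the pre-change filename is a placeholder; only the run boundary 18836 and the post-SR1 filename come from this thread.

```python
# Illustrative only; the function name and the hax integration are assumptions.
POST_SR1_FIRST_RUN = 18836  # first run with the new PMT status

def fann_map_for_run(run_number):
    """Pick the FANN position-reconstruction map by run number."""
    if run_number >= POST_SR1_FIRST_RUN:
        return 'XENON1T_NN_v9_mc_v030_20180613_postSR1.npz'
    return 'XENON1T_NN_sr1_placeholder.npz'  # stand-in for the SR1-era map
```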
I believe the gain model is stored in the CorrectionsDB and then propagated to RunsDB, which pax then draws from (@tunnell, @skazama?). So version control is not super strict, just a number inside the pax_metadata: `"correction_versions": {"AddGains": "4.0"}`. So indeed we need to be careful here, i.e. all processing should stop before @skazama updates the gain model; then we make a new pax release (using the pax version to control the gain version), and then proceed with reprocessing. In this case we would need a new pax version, since I think the gain model update will affect older runs (maybe even those in the SR1 paper, @skazama?)
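For illustration, a hedged sketch of checking that number in the processed-file metadata; the surrounding structure is assumed, and only the `"correction_versions"` -> `"AddGains"` entry is from the quote above:

```python
import json

# Assumed shape of the pax metadata blob; only the
# "correction_versions" -> "AddGains" entry is quoted in this thread.
pax_metadata = json.loads('{"correction_versions": {"AddGains": "4.0"}}')

gain_version = pax_metadata['correction_versions']['AddGains']
assert gain_version == '4.0', 'unexpected gain model version %s' % gain_version
```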
In summary, bumping the pax version may create some duplicate files of old runs that have no change, but seems safer/easier for bookkeeping overall.
Just to confirm, @skazama @weiyuehuan: these new NN maps shall only be applied to runs >= 18836?
Then, @weiyuehuan, should the filename instead be `XENON1T_NN_v8_mc_v030_20180613_postSR1.npz`? Or does `v9` imply something changed in the NN method?
Regarding the label `postSR1`, @hasterok @mcfatelin: shall we try to be more precise in naming the various run ranges? For example, `SR1b` for data after the SI paper dataset (runs >= 16640), `SR2` for runs >= 18836, etc., considering all significant detector condition changes that can lead to e.g. new cut tunings in lax (not sure postsr1.py will be sufficient, @JelleAalbers @sreichard)?
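As a sketch, such a naming scheme could be written down as a simple lookup; the labels and boundaries are the ones proposed above, while the SR1 start run (0) is a placeholder, since the thread only gives the upper boundaries:

```python
# Proposed run-range labels from this thread; SR1 start run is a placeholder.
RUN_RANGES = {
    'SR1':  (0, 16639),      # up to and including the SI paper dataset
    'SR1b': (16640, 18835),  # after the SI paper dataset
    'SR2':  (18836, None),   # after the post-SR1 PMT status change
}
```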
I agree with putting the run number to which the map is applicable into the filename. But when generating minitrees, I guess we are still manually selecting the map for different runs, right?
Please see e.g. https://github.com/XENON1T/hax/pull/233 for hax treatment.
Neural Net after SR1 and PMT change: https://xe1t-wiki.lngs.infn.it/doku.php?id=xenon:xenon1t:ops:post_sr1_daq_operation#change_of_pmt_status_after_sr1
Only the top PMT array is used for NN training. Compared to SR1, the new net is based on the PMT condition: SR1 + PMT(12, 27, 34, 73, 86) - PMT87.
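Spelled out as set arithmetic, that condition reads as follows (the SR1 working-PMT set here is a stand-in, not the real list):

```python
# Stand-in for the set of top-array PMTs that were working in SR1.
sr1_working_pmts = {0, 1, 2}

# Post-SR1 condition: PMTs 12, 27, 34, 73 and 86 are back, PMT 87 is gone.
post_sr1_working_pmts = (sr1_working_pmts | {12, 27, 34, 73, 86}) - {87}
```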
FANN: XENON1T_NN_v9_mc_v030_20180613_postSR1.npz
TensorFlow: XENON1T_tensorflow_nn_pos_20180613_postsr1.json XENON1T_tensorflow_nn_pos_weights_20180613_postsr1.h5
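For anyone who wants to inspect the updated nets, a hedged sketch of loading these files; it assumes the standard Keras JSON-architecture + HDF5-weights convention and plain numpy for the FANN map, which may not match exactly how pax/hax load them:

```python
import numpy as np
from keras.models import model_from_json

# TensorFlow net: architecture as JSON plus weights as HDF5, assuming the
# standard Keras convention for this pair of files.
with open('XENON1T_tensorflow_nn_pos_20180613_postsr1.json') as f:
    model = model_from_json(f.read())
model.load_weights('XENON1T_tensorflow_nn_pos_weights_20180613_postsr1.h5')

# FANN map: a numpy archive; the array names inside are not documented
# in this thread, so just list what it contains.
fann_map = np.load('XENON1T_NN_v9_mc_v030_20180613_postSR1.npz')
print(fann_map.files)
```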
Both the FANN and TensorFlow nets are updated!