Nice work. Are there any performance plots on the wiki (instead of posting them here)?
Performance on optical MC: FANN shows a clear improvement with more PMTs working on the top array, while the improvement is tiny for the TensorFlow net. In the plots below, the error is the absolute distance between the true and reconstructed position.
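For reference, a minimal sketch of how that error metric can be computed, assuming hypothetical `true_xy` and `reco_xy` arrays of (x, y) positions (the names and values are illustrative, not from the actual processing code):

```python
import numpy as np

# Hypothetical (N, 2) arrays of (x, y) positions in cm; in practice these
# would come from the optical MC truth and the reconstruction output.
true_xy = np.array([[0.0, 0.0], [10.0, -5.0]])
reco_xy = np.array([[0.3, -0.4], [10.5, -4.8]])

# Error = absolute (Euclidean) distance between true and reconstructed position
error = np.linalg.norm(true_xy - reco_xy, axis=1)
print(error)  # [0.5  0.53851648]
```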
@tunnell Fully agree! Can you help to move the maps to rundb instead? I don't have any knowledge about rundb....
@weiyuehuan It's clear that TensorFlow's performance is better than FANN's; maybe you can send an email to the analysis group to suggest a switch to it in post-SR1 analyses? Any opinions from @mcfatelin and @hasterok?
I agree that TensorFlow looks better. I think we should try to use it for the post-SI analyses. @tunnell Do we have the resources to reprocess SR1 and SR0?
I also strongly agree with moving to the TF NN, if only because it's more robust against overfitting. @hasterok You'd have to ask Boris and/or Evan, but the resources just have to be found if there is an analysis need.
@tunnell @hasterok I'm not sure I understand this correctly.... I thought the TFNN is in the hax level. For switching to TFNN, we need the correction maps (S1, S2 and FDC) from TFNN. @weiyuehuan Can you please confirm?
We definitely have the resources if we want to reprocess all of SR0+SR1. I would just ask that we make sure that any other changes like 2-fold coincidence that we might want are implemented and tested beforehand so we don't try to reprocess twice.
Edit: I also share the same question as Qing; I was told before that gains are the only correction really used at the pax level. I guess the NN is also?
@mcfatelin @hasterok @ershockley @tunnell No, the TensorFlow-related variables are already saved in the posrec minitrees and were ready even for the SR1 analysis, including the corresponding FDC. We didn't use it because of the famous 'time pressure'. My suggestion was that the analysis coordinators should encourage people to use it for post-SR1 analysis. Things dependent on the reconstructed position are:
Sorry, between this thread and the email exchange on reprocessing, it's still not clear to me what the situation is.
So the TensorFlow map is implemented at the hax level but FANN is at the pax level? Is this right?
How is this related to the gain model? Do we need to update the gain model first and then update FANN again, or is it just that, since we want to bump versions anyway, we want to update the gains at the same time?
I just want to also add that I will not process data using pax_head. It needs to be a tagged release, so this needs to be merged before proceeding.
@ershockley What you describe here is correct.
After merging this PR I will make a pax release, say pax_v6.9.1. @tunnell, why has the automatic check stopped? @ershockley, if we want to process data with pax_v6.9.1, we will hit the map name issue in rundb; do we have a solution yet?
Thanks @feigaodm. For the map name issue, I think it's only a problem if we want to continue processing with v6.8.0 while also (re)processing with this new release, and I don't see why we would. If we bump to a new version and then stop v6.8.0, we can just update all runs to use the new map, right?
@coderdj is probably the best expert on this. Don't we just have to update the corrections DB or something? Then cax should update runsDB for us.
I would suggest implementing FANN in hax at the same level as the TFNN. This would just be an interim solution until all of this can be fully/properly refactored (@tunnell?), but it should at least remove the dependence on RunsDB, and we can easily define the map run-dependence the same way as for the TFNN. Then we don't need to reprocess when just changing NNs (any new processing will simply use the old dummy NN). Also, in principle, if the NN method hasn't changed and only the inputs changed, then it shouldn't require a new pax version, i.e. we don't need to reprocess old runs with the same map, and only need to correct new runs that used the wrong map. See the sketch below for what that run-dependence could look like.
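A minimal sketch of that run-dependent map selection, purely illustrative: hax does not necessarily expose this exact hook, and the pre-change filename is a placeholder; only the run boundary 18836 and the post-SR1 filename come from this thread.

```python
# Illustrative only; the function name and the hax integration are assumptions.
POST_SR1_FIRST_RUN = 18836  # first run with the new PMT status

def fann_map_for_run(run_number):
    """Pick the FANN position-reconstruction map by run number."""
    if run_number >= POST_SR1_FIRST_RUN:
        return 'XENON1T_NN_v9_mc_v030_20180613_postSR1.npz'
    return 'XENON1T_NN_sr1_placeholder.npz'  # stand-in for the SR1-era map
```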
I believe the gain model is stored in the CorrectionsDB and then propagated to RunsDB, which pax then draws from (@tunnell, @skazama?). So version control is not super strict, just a number inside the pax_metadata: `"correction_versions": {"AddGains": "4.0"}`. So indeed we need to be careful here, i.e. all processing should stop before @skazama updates the gain model; then we make a new pax release (using the pax version to control the gain version), and then proceed with reprocessing. In this case we would need a new pax version, since I think the gain model update will affect older runs (maybe even those in the SR1 paper, @skazama?)
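For illustration, a hedged sketch of checking that number in the processed-file metadata; the surrounding structure is assumed, and only the `"correction_versions"` -> `"AddGains"` entry is from the quote above:

```python
import json

# Assumed shape of the pax metadata blob; only the
# "correction_versions" -> "AddGains" entry is quoted in this thread.
pax_metadata = json.loads('{"correction_versions": {"AddGains": "4.0"}}')

gain_version = pax_metadata['correction_versions']['AddGains']
assert gain_version == '4.0', 'unexpected gain model version %s' % gain_version
```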
In summary, bumping the pax version may create some duplicate files of old runs that have no change, but seems safer/easier for bookkeeping overall.
Just to confirm, @skazama @weiyuehuan: these new NN maps shall only be applied to runs >= 18836?
Then, @weiyuehuan, should the filename instead be `XENON1T_NN_v8_mc_v030_20180613_postSR1.npz`? Or does `v9` imply something changed in the NN method?
Regarding the label `postSR1`, @hasterok @mcfatelin: shall we try to be more precise in naming the various run ranges? For example, `SR1b` for data after the SI paper dataset (runs >= 16640), `SR2` for runs >= 18836, etc., considering all significant detector condition changes that can lead to e.g. new cut tunings in lax (not sure postsr1.py will be sufficient, @JelleAalbers @sreichard)?
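As a sketch, such a naming scheme could be written down as a simple lookup; the labels and boundaries are the ones proposed above, while the SR1 start run (0) is a placeholder, since the thread only gives the upper boundaries:

```python
# Proposed run-range labels from this thread; SR1 start run is a placeholder.
RUN_RANGES = {
    'SR1':  (0, 16639),      # up to and including the SI paper dataset
    'SR1b': (16640, 18835),  # after the SI paper dataset
    'SR2':  (18836, None),   # after the post-SR1 PMT status change
}
```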
I agree with putting the run number to which the map is applicable into the filename. But when generating minitrees, I guess we are still manually selecting the map for different runs, right?
Please see e.g. https://github.com/XENON1T/hax/pull/233 for hax treatment.
Neural Net after SR1 and PMT change: https://xe1t-wiki.lngs.infn.it/doku.php?id=xenon:xenon1t:ops:post_sr1_daq_operation#change_of_pmt_status_after_sr1
Only the top PMT array is used for NN training. Compared to SR1, the new net is based on the PMT condition: SR1 + PMT(12, 27, 34, 73, 86) - PMT87.
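Spelled out as set arithmetic, that condition reads as follows (the SR1 working-PMT set here is a stand-in, not the real list):

```python
# Stand-in for the set of top-array PMTs that were working in SR1.
sr1_working_pmts = {0, 1, 2}

# Post-SR1 condition: PMTs 12, 27, 34, 73 and 86 are back, PMT 87 is gone.
post_sr1_working_pmts = (sr1_working_pmts | {12, 27, 34, 73, 86}) - {87}
```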
FANN: XENON1T_NN_v9_mc_v030_20180613_postSR1.npz
TensorFlow: XENON1T_tensorflow_nn_pos_20180613_postsr1.json XENON1T_tensorflow_nn_pos_weights_20180613_postsr1.h5
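For anyone who wants to inspect the updated nets, a hedged sketch of loading these files; it assumes the standard Keras JSON-architecture + HDF5-weights convention and plain numpy for the FANN map, which may not match exactly how pax/hax load them:

```python
import numpy as np
from keras.models import model_from_json

# TensorFlow net: architecture as JSON plus weights as HDF5, assuming the
# standard Keras convention for this pair of files.
with open('XENON1T_tensorflow_nn_pos_20180613_postsr1.json') as f:
    model = model_from_json(f.read())
model.load_weights('XENON1T_tensorflow_nn_pos_weights_20180613_postsr1.h5')

# FANN map: a numpy archive; the array names inside are not documented
# in this thread, so just list what it contains.
fann_map = np.load('XENON1T_NN_v9_mc_v030_20180613_postSR1.npz')
print(fann_map.files)
```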
Both the FANN and TensorFlow nets are updated!