XENON1T / pax

The XENON1T raw data processor [deprecated]
BSD 3-Clause "New" or "Revised" License

upload new NN net after SR1 #709

Closed weiyuehuan closed 6 years ago

weiyuehuan commented 6 years ago

Neural Net after SR1 and PMT change: https://xe1t-wiki.lngs.infn.it/doku.php?id=xenon:xenon1t:ops:post_sr1_daq_operation#change_of_pmt_status_after_sr1

Only the top PMT array is used for NN training. Compared to SR1, the new net is based on the PMT condition: SR1 + PMT(12, 27, 34, 73, 86) - PMT87

FANN: XENON1T_NN_v9_mc_v030_20180613_postSR1.npz

TensorFlow: XENON1T_tensorflow_nn_pos_20180613_postsr1.json XENON1T_tensorflow_nn_pos_weights_20180613_postsr1.h5

Both the FANN and TensorFlow nets are updated!
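For reference, a minimal sketch of what inspecting such an `.npz` weights archive could look like. The array names and shapes below are invented for illustration; the actual internal layout of `XENON1T_NN_v9_mc_v030_20180613_postSR1.npz` is not documented in this thread:

```python
import numpy as np

# Hypothetical stand-in for the FANN weights file: a NumPy archive
# holding per-layer weight arrays (names/shapes are assumptions).
np.savez("nn_demo.npz",
         layer_0_weights=np.random.rand(127, 30),  # top-array PMT inputs -> hidden
         layer_1_weights=np.random.rand(30, 2))    # hidden -> reconstructed (x, y)

# Inspect the archive: list stored arrays and their shapes.
with np.load("nn_demo.npz") as net:
    for name in net.files:
        print(name, net[name].shape)
```

The TensorFlow net is split the same conventional way Keras models often are: a JSON file for the architecture and an HDF5 file for the weights.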

tunnell commented 6 years ago

Nice work. Are there any performance plots on the wiki (not posting here)?

weiyuehuan commented 6 years ago

Performance on optical MC: FANN shows a clear improvement with more PMTs working on the top array, while the improvement for the TensorFlow net is tiny. In the plots below, the error is the absolute distance between the true and reconstructed positions.

[Plots: postsr1_1, postsr1_2, postsr1_3]
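The error metric described above (absolute distance between true and reconstructed position) can be sketched as follows; the positions here are dummy values for illustration only:

```python
import numpy as np

# Reconstruction error as described in the thread: Euclidean distance
# between true and reconstructed (x, y) positions on the top array.
true_xy = np.array([[0.0, 0.0], [10.0, -5.0], [-20.0, 15.0]])  # cm (dummy)
reco_xy = np.array([[0.5, 0.1], [ 9.0, -4.0], [-18.0, 17.0]])  # cm (dummy)

errors = np.linalg.norm(reco_xy - true_xy, axis=1)  # per-event distance
print(errors)
print(errors.mean())  # mean reconstruction error over events
```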

feigaodm commented 6 years ago

@tunnell Fully agree! Can you help move the maps to the rundb instead? I don't have any knowledge about rundb....

feigaodm commented 6 years ago

@weiyuehuan It's clear that TensorFlow's performance is better than FANN's; maybe you can send an email to the analysis group to suggest switching to it in post-SR1 analyses? Any opinions from @mcfatelin and @hasterok?

hasterok commented 6 years ago

I agree that TensorFlow looks better. I think we should try to use it for the post-SI analyses. @tunnell Do we have the resources to reprocess SR1 and SR0?

tunnell commented 6 years ago

I also strongly agree with moving to the TF NN, since it's more robust against overfitting. @hasterok You'd have to ask Boris and/or Evan, but the resources just have to be found if there is an analysis need.

mcfatelin commented 6 years ago

@tunnell @hasterok I'm not sure I understand this correctly.... I thought the TFNN is at the hax level. To switch to the TFNN, we need the correction maps (S1, S2, and FDC) from the TFNN. @weiyuehuan Can you please confirm?

ershockley commented 6 years ago

We definitely have the resources if we want to reprocess all of SR0+SR1. I would just ask that any other changes we might want, like 2-fold coincidence, are implemented and tested beforehand, so we don't end up reprocessing twice.

Edit: I also share the same question as Qing. I was told before that gains are the only correction really used at the pax level. I guess the NN is as well?

feigaodm commented 6 years ago

@mcfatelin @hasterok @ershockley @tunnell No, the TensorFlow-related variables are already saved in the posrec minitrees, and were ready even for SR1 analysis, including the corresponding FDC. We didn't use them because of the famous 'time pressure'. My suggestion was that the analysis coordinators should encourage people to use it for post-SR1 analysis. Things dependent on reconstructed position are:

ershockley commented 6 years ago

Sorry, between this thread and the email exchange on reprocessing, it's still not clear to me what the situation is.

I just want to also add that I will not process data using pax_head. It needs to be a tagged release, so this needs to be merged before proceeding.

feigaodm commented 6 years ago

@ershockley What you describe here is correct.

After merging this PR I will make a pax release of, say, pax_v6.9.1. @tunnell why has the automatic check stopped? @ershockley if we want to process data with pax_v6.9.1, we will hit the map-name issue in rundb; do we have a solution yet?

ershockley commented 6 years ago

Thanks @feigaodm. For the map-name issue, I think it's only a problem if we want to continue processing with v6.8.0 while also (re)processing with this new release, and I don't see why we would. If we bump to a new version and then stop v6.8.0, we can just update all runs to use the new map, right?

@coderdj is probably the best expert on this. Don't we just have to update the corrections DB or something? Then cax should update runsDB for us.

pdeperio commented 6 years ago

In summary, bumping the pax version may create some duplicate files of old runs that have no change, but seems safer/easier for bookkeeping overall.

pdeperio commented 6 years ago

Just to confirm, @skazama @weiyuehuan: these new NN maps shall only be applied to runs >= 18836?

Then, @weiyuehuan, should the filename instead be XENON1T_NN_v8_mc_v030_20180613_postSR1.npz? Or does v9 imply something changed in the NN method?

Regarding the label postSR1, @hasterok @mcfatelin: shall we try to be more precise in naming the various run ranges? For example, SR1b for data after the SI-paper dataset (runs >= 16640), SR2 for runs >= 18836, etc., considering all significant detector-condition changes that can lead to, e.g., new cut tunings in lax (not sure postsr1.py will be sufficient, @JelleAalbers @sreichard)?
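The run-range naming proposed above could be sketched as a simple lookup. This is only an illustration of the suggestion, with the boundaries taken from this thread; the label names are not an agreed convention, and SR0 and other boundaries are deliberately ignored here:

```python
# Hedged sketch: map a run number to the dataset labels proposed above.
# Boundaries (16640, 18836) are from the thread; everything else is assumed.
def run_label(run_number: int) -> str:
    if run_number >= 18836:
        return "SR2"   # post-SR1 PMT/NN conditions, new NN maps apply
    if run_number >= 16640:
        return "SR1b"  # data after the SI-paper dataset
    return "SR1"       # earlier data (simplified; ignores SR0 etc.)

print(run_label(19000))  # "SR2"
```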

mcfatelin commented 6 years ago

I agree with putting the run range, to which the map is applicable, into the filename. But when generating minitrees, I guess we are still manually selecting the map for different runs, right?

pdeperio commented 6 years ago

Please see e.g. https://github.com/XENON1T/hax/pull/233 for hax treatment.