DUNE / larnd-sim

Simulation framework for a pixelated Liquid Argon TPC
Apache License 2.0
10 stars 26 forks source link

In compatibility of the light backtracking segment info and event / trigger info #205

Open YifanC opened 5 months ago

YifanC commented 5 months ago

Currently the light backtracking information is stored in [('trigger_id', '<i4'), ('op_channel_id', '<i4'), ('tick', '<i4'), ('event_id', '<i4'), ('segment_id', '<i8'), ('pe_current', '<f8')]. The change is made in this PR.

Quinn @qhvuong noticed that a single segment can correspond to multiple triggers in the current output (985413e). Tested on MiniRun5_1E19_RHC.convert2h5.00666.EDEPSIM.hdf5. See the snapshot of some examples:

segment_id:  20 ; trigger_id:  [0 4] ; event_id: [666000 666004]
segment_id:  52 ; trigger_id:  [0 1 4] ; event_id: [666000 666001 666004]
segment_id:  385 ; trigger_id:  [ 6 11 12] ; event_id: [666006 666012 666013]
segment_id:  720 ; trigger_id:  [ 6  8 18] ; event_id: [666006 666009 666019]
segment_id:  4507 ; trigger_id:  [32 55 75] ; event_id: [666033 666056 666077]

The rate of such occurrence is very high. The true event_id correspond to a segment is not necessary in the list of event_ids. For example segment 4507 is actually from event 666019.

Tag @marjoleinvannuland @russellphysics @qhvuong @krwood @mjkramer to follow up on this issue. For the moment, we plan to carry on MR5 production, and implement the fix for this issue in MR5.1.

russellphysics commented 5 months ago

Regarding trigger ID, inconsistent call to export_light_wvfm_to_hdf5 between here (correct) and here (bug)

YifanC commented 5 months ago

@russellphysics Could you elaborate please?

russellphysics commented 5 months ago

"i_trig" not provided in the function call at the last link

YifanC commented 5 months ago

Ah! Yes! You are absolutely right. Testing now.

russellphysics commented 5 months ago

I don't fully grasp the difference and relationship between event ID and light trigger ID; regardless, the hard coding of event_id[0] is likeley suspect

EDIT: ignore this comment. This assumption is ok... based on this

YifanC commented 5 months ago

While implementing the fix Brooke pointed out, I realised it shouldn't affect the current output... as the error is in a chunk that uses threshold trigger.

marjoleinvannuland commented 5 months ago

Manual check of the input to the export_to_hdf5 function in simulate_pixels.py (for all four modules, no segments found that are in more than module/event_id/trigger_id). So, the input appears to be as expected.

Example output used for checking:

> Simulating batches...:   4%|▋                   | 7/191 [00:07<03:24,  1.11s/it]
> input event id and trigger:  [8] 7
> true segments: [478 475 478 475 478 475 478 475 478 475 478 475 478 475 478 475 478 475
>  478 475 478 478 475 478 475 478 475 478 475 478 475 478 475 478 475 478
>  475 478]

Print statements were inserted in simulate_pixels.py before:

                light_sim.export_light_wvfm_to_hdf5(results['light_event_id'],
                                                    results['light_waveforms'],
                                                    output_filename,
                                                    results['light_waveforms_true_track_id'],
                                                    results['light_waveforms_true_photons'],
                                                    i_trig,
                                                    i_mod)

I printed results['light_event_id'], i_trig and results['light_waveforms_true_track_id']. I also tested the np.ndenumerate line in the zero_suppress function. This appears to work as expected. The indexing is also done correctly I believe

marjoleinvannuland commented 5 months ago

If mod2mod variations are turned off, the issue does not appear

marjoleinvannuland commented 5 months ago

Maybe related, maybe a separate issue, but only half of the op_channel_ids seem to be there. np.unique(test_file['light_wvfm_mc_assn']['op_channel_id']) = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335], dtype=int32)

marjoleinvannuland commented 5 months ago

Are we missing an indent in this line (and the following ones?) https://github.com/DUNE/larnd-sim/blob/785fbbb008ae9f76602c25706cb3267036ea4bd2/larndsim/light_sim.py#L515 And should i_det be i_det_signal here https://github.com/DUNE/larnd-sim/blob/785fbbb008ae9f76602c25706cb3267036ea4bd2/larndsim/light_sim.py#L537

krwood commented 5 months ago

The op_channel_id array actually looks correct: each ADC unit per TPC has 96 channels; however only 48 of them are actually required for the 2x2. We have some spare/extra channels

marjoleinvannuland commented 5 months ago

The op_channel_id array actually looks correct: each ADC unit per TPC has 96 channels; however only 48 of them are actually required for the 2x2. We have some spare/extra channels

I don't understand? 48 channels per tpc (because twice 2 arclights with 6 sipms and 6 lcms with 2 sipms) would be 96 per module, so 384 in total right? I only see half of that. It shows 0-47 and then 96-143 etc.

krwood commented 5 months ago

You don't understand because I'm saying the wrong thing!

Each ADC unit per module has 96 channels; however, only 48 of them are actually required for the 2x2.

Each TPC has 4 light collection module with 6 sipms each --> 24 channels per TPC --> 48 channels per module. But the hardware has enough for 96 channels each.

krwood commented 5 months ago

@marjoleinvannuland I'm completely screwing this up... sorry for the confusion. You are correct, we should indeed have 384 active channels.

YifanC commented 5 months ago

@marjoleinvannuland for your test file where you printed np.unique(test_file['light_wvfm_mc_assn']['op_channel_id']), how many events are you testing? For a small number of events, it's possible that not all optical channels are listed here.

marjoleinvannuland commented 5 months ago

I checked an entire file. Even if not all channels are there, I do not expect exactly half of them to be missing. But if I add the indent, I get all the channels back. Also less segments seem to have multiple event_ids. However, some of them still have. So, it does not fix everything.

Are we missing an indent in this line (and the following ones?)

https://github.com/DUNE/larnd-sim/blob/785fbbb008ae9f76602c25706cb3267036ea4bd2/larndsim/light_sim.py#L515

And should i_det be i_det_signal here

https://github.com/DUNE/larnd-sim/blob/785fbbb008ae9f76602c25706cb3267036ea4bd2/larndsim/light_sim.py#L537

krwood commented 4 months ago

@marjoleinvannuland Should we get this fix into develop? We noticed some undesirable features in the MiniRun5_beta1 files that I suspect are connected to this. E.g. see page 13 of https://portal.nersc.gov/project/dune/data/2x2/simulation/productions/MiniRun5_1E19_RHC/MiniRun5_1E19_RHC.plots.beta1/PLOTS/LARNDSIM/0000000/MiniRun5_1E19_RHC.larnd.0000012.LARNDSIM_validations.pdf, also included here for convenience:

Screen Shot 2024-03-08 at 3 35 36 PM

Can you try pushing a file before and after your fix through https://github.com/DUNE/2x2_sim/blob/develop/run-validation/larndsim_validation.py?

marjoleinvannuland commented 2 months ago

https://github.com/DUNE/larnd-sim/blob/70d339192badf69f0b128d377cb30d0e29f8fbce/cli/simulate_pixels.py#L976 This is where the problem originates I believe. Replacing with selected_tracks['segment_id'] solves the problem with the segments that correspond to multiple triggers and event_ids. I am still going to do some checks and validation on the output (@krwood @YifanC @russellphysics) Edit: larndsim validation

YifanC commented 2 months ago

@marjoleinvannuland Cool! I think this looks like it!

marjoleinvannuland commented 2 months ago

If we need to use the index of the segment_id instead of the segment_id itself, I think this line should be outside the loop over the modules. I see no reason to not use the segment_id, but if we plan to make them not unique in a file in the foreseeable future maybe we should consider this? Edit: larndsim validation

jaafar-chakrani commented 2 months ago

FYI, since it was only briefly mentioned here, I opened issue #223 where I explain what I observe regarding the missing light channels