XENONnT / straxen

Streaming analysis for XENON
BSD 3-Clause "New" or "Revised" License
20 stars 32 forks source link

Make peaklets dtype flexiable #1299

Closed WenzDaniel closed 8 months ago

WenzDaniel commented 9 months ago

Side note this PR will change the lineage of everything > peaklets!

What does the code in this PR do / what does it improve?

Simple PR which makes the dtype propagation for peaklets to peaks more flexible. In the current implementation the dtype of merged S2s and peaks is fixed to the peaklets_dtype which is fixed to the peaks_dtype defined in strax. Already in the past this lead to an issue when we added new optional fields to peaks. For example when adding the hit timing https://github.com/AxFoundation/strax/blame/21cc96e011b0e4099138979791f34e8b1addedb7/strax/dtypes.py#L211 or top waveform https://github.com/AxFoundation/strax/blame/21cc96e011b0e4099138979791f34e8b1addedb7/strax/dtypes.py#L222. Especially, for the latter we added an if statement into merged_S2s to also include the top_data field when doing so on the peaklet level. Also for the SOM peaklet_classifcation this lead to issues because no additional fields could be added to peaklet_classifcation without also changing the peaklet data-type.

This is in general a not very strax-like behavior in which we know for each plugin its dependencies and the data-types of these dependencies.

Can you briefly describe how it works?

To circumvent this issue and to make the peaklet and peaklet_classification data-type more flexiable, I compute now the merged_S2s dtype based on the peaklets and peaklet_classifcation dtypes of the currently registered plugins. This also allows us to remove the additional if-statement in merged_s2s needed for the top waveform. In this way analysts can add as many fields to their own peaklet and classification plugins as they like. To be still able to control which parameters are propagated from peaklets to peaks (and higher up) I added a copy statement at the end of peaks which only copies all fields from peaklets which are provided by the Peaks plugin. The additional computation time is negligible.

Although the changes do not change the data provided by the plugins, I bumped the code version.

WenzDaniel commented 9 months ago

Edit: We still need the if statement in merged_s2s because the functions which build the merged S2s require the field...

FaroutYLq commented 9 months ago

I am happy to review this one. Isn't this PR more about peaks rather than peaklets? If it change peaklets lineage, the time for v14 reprocessing is gonna be way longer.

WenzDaniel commented 9 months ago

@FaroutYLq thanks, yes it only changes the lineage of everything above peaklets. I also have an idea how to fix the remaining test failure. This requires a change in strax though. I hope I can add it today.

WenzDaniel commented 9 months ago

@FaroutYLq this PR requires: https://github.com/AxFoundation/strax/pull/785 once it is merged it should work. If you have time it would be nice if you could review the other one as well.

coveralls commented 9 months ago

Coverage Status

coverage: 92.815% (-0.007%) from 92.822% when pulling 3c7b73387666e1cece81e506f54e534c45567c2f on make_peaklets_dtype_flexiable into f23319dd37a23dddcc37156e62d112e36c8deafc on master.

WenzDaniel commented 8 months ago

I changed back the plugin version and added back in redundant option to preserve the lineage.

dachengx commented 8 months ago

I have checked that the lineage of peaks does not change.