cta-observatory / cta-lstchain

LST prototype testbench chain
https://cta-observatory.github.io/cta-lstchain/
BSD 3-Clause "New" or "Revised" License

Computation of pointing-dependent weights for MC events #1259

Closed: moralejo closed this 2 weeks ago

moralejo commented 4 weeks ago

This is to avoid jumps in RF performance at the "mid-points" between training pointing nodes, which arise just because the event statistics occasionally jump between neighboring nodes.

The idea is to use "sample_weight", i.e. a weight for each event in the training set, calculated such that each of the pointing nodes has the same total weight.

The calculation is done with all events that are passed to the RFs. Of course, the energy (and impact) ranges of those events change a lot between culmination and high zenith, so "same number of events" does not mean much when comparing such extremes; but the point here is to avoid steps between neighboring bins (and for those, the numbers of events are indeed comparable quantities). Only if the impact and energy ranges were poorly chosen (and the distributions truncated) would this normalization result in performance jumps like the ones we have without it.
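For reference, a minimal sketch of the weighting scheme described above, assuming the training events come as a pandas DataFrame with columns that identify the pointing node (the column names, helper name and feature list below are illustrative, not the actual lstchain API): each event gets a weight inversely proportional to the population of its node, so that every node contributes the same total weight, and the result is passed to the scikit-learn fit via `sample_weight`.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def compute_node_weights(events, node_cols=("alt_tel", "az_tel")):
    """Per-event weights such that every pointing node has the same total weight.

    `events`: DataFrame of MC training events.
    `node_cols`: columns identifying the training pointing node (illustrative names).
    """
    node_cols = list(node_cols)
    # Number of events in the node each event belongs to.
    counts = events.groupby(node_cols)[node_cols[0]].transform("size")
    n_nodes = events.groupby(node_cols).ngroups
    # Each node sums to len(events) / n_nodes, so the mean weight is 1 and
    # the total effective statistics are unchanged.
    return len(events) / (n_nodes * counts)


# Hypothetical usage: train an energy regressor with the per-node weights.
# features = ["intensity", "length", "width"]   # illustrative feature list
# weights = compute_node_weights(events)
# rf = RandomForestRegressor(n_estimators=100, n_jobs=-1)
# rf.fit(events[features], events["log_mc_energy"], sample_weight=weights)
```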

codecov[bot] commented 4 weeks ago

Codecov Report

Attention: Patch coverage is 85.36585% with 6 lines in your changes missing coverage. Please review.

Project coverage is 73.27%. Comparing base (e097079) to head (5974bb8). Report is 21 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| lstchain/reco/dl1_to_dl2.py | 78.57% | 3 Missing :warning: |
| lstchain/reco/utils.py | 85.00% | 3 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1259      +/-   ##
==========================================
+ Coverage   73.05%   73.27%   +0.22%
==========================================
  Files         134      134
  Lines       14039    14081      +42
==========================================
+ Hits        10256    10318      +62
+ Misses       3783     3763      -20
```

:umbrella: View full report in Codecov by Sentry.

moralejo commented 4 weeks ago

It seems to work well:

[image: comparison plot from the original comment]

The gammaness cut is 0.5 for the standard case, and tuned to give the same number of events at zd ~ 18 deg.

With "standard RFs", aside from the (smooth) physical dependence with zenith, we have jumps which result from the non-smoothness of the evolution of the number of training events with pointing, along declination lines. The jump above (in the blue histogram) is located at the mid-point between two training nodes.

These artifacts can be removed with the changes proposed here. In all likelihood they are partly responsible for the few-percent systematic errors in flux estimation that we have.

moralejo commented 4 weeks ago

@vuillaut The training with the Crab line took nearly 10 hours in cp03. Is that normal, or does the use of weights perhaps make it slower?

vuillaut commented 4 weeks ago

> @vuillaut The training with the Crab line took nearly 10 hours in cp03. Is that normal, or does the use of weights perhaps make it slower?

Hi, using how many cores?
But that sounds right.