Closed moralejo closed 2 weeks ago
Attention: Patch coverage is 85.36585%, with 6 lines in your changes missing coverage. Please review.
Project coverage is 73.27%. Comparing base (e097079) to head (5974bb8). Report is 21 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
lstchain/reco/dl1_to_dl2.py | 78.57% | 3 Missing :warning: |
lstchain/reco/utils.py | 85.00% | 3 Missing :warning: |
It seems to work well:
The gammaness cut is 0.5 for the standard RFs, and for the new ones it is tuned to give the same number of events at zd ~ 18 deg.
With "standard RFs", aside from the (smooth) physical dependence on zenith, we have jumps which result from the non-smooth evolution of the number of training events with pointing along declination lines. The jump above (in the blue histogram) is located at the mid-point between two training nodes.
These artifacts can be removed with the changes proposed here. In all likelihood they are partly responsible for the few-percent systematic errors in flux estimation that we have.
@vuillaut The training with the Crab line took nearly 10 hours in cp03. Is that normal, or perhaps the use of weights makes it slower?
Hi
Using how many cores?
But that sounds right
This is to avoid jumps in RF performance in the "mid-points" between training pointing nodes, just because the event statistics occasionally have jumps between neighboring nodes.
The idea is to use "sample_weight", i.e. a weight for each event in the training set, calculated such that each of the pointing nodes has the same total weight.
The calculation is done with all events that are passed to the RFs. Of course, the energy (and impact) ranges of those events change a lot between culmination and high zenith, so "same number of events" does not mean much when comparing such extremes. The point here, however, is to avoid steps between neighboring bins, and for those the numbers of events are indeed comparable quantities. Only if the impact and energy ranges were poorly chosen (and the distributions truncated) would this normalization result in performance jumps like the ones we have without it.
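
For illustration, the per-node normalization described above could be sketched roughly as follows. This is a minimal, plain-Python sketch and not the code in this PR: the function name `equal_node_weights` and the `node_ids` labels are hypothetical, and the overall scale (mean weight of 1) is an arbitrary choice made here.

```python
from collections import Counter


def equal_node_weights(node_ids):
    """Return one weight per event so that every pointing node
    ends up with the same total weight.

    node_ids: sequence holding, for each training event, a label of the
    pointing node it was simulated at (hypothetical labels, for illustration).
    """
    counts = Counter(node_ids)   # number of events in each node
    n_events = len(node_ids)
    n_nodes = len(counts)
    # Each node gets a total weight of n_events / n_nodes, shared equally
    # among its events. The weights then sum to n_events (mean weight 1).
    return [n_events / (n_nodes * counts[node]) for node in node_ids]


# Toy example: node "A" has 4 events, node "B" has only 1
nodes = ["A", "A", "A", "A", "B"]
weights = equal_node_weights(nodes)
```

In this toy case each "A" event gets a smaller weight than the single "B" event, and both nodes carry the same total weight. An array of such weights is what would be passed as `sample_weight` when fitting, for example through the `sample_weight` argument of scikit-learn's `fit` method.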