cta-observatory / cta-lstchain

LST prototype testbench chain
https://cta-observatory.github.io/cta-lstchain/
BSD 3-Clause "New" or "Revised" License

Reduce number of trees in RF regressors #1294

Closed: moralejo closed this 3 days ago

moralejo commented 2 weeks ago

Reduced from 150 to 50 trees; physics performance is not impacted, while memory needs will decrease (and speed will increase).
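
For context, a minimal sketch of the kind of change involved, assuming scikit-learn random forests; the `n_estimators` values are the ones discussed in this PR, while everything else is illustrative and not necessarily the lstchain defaults:

```python
from sklearn.ensemble import RandomForestRegressor

# Illustrative only: model size on disk and prediction time scale roughly
# linearly with the number of trees, so cutting n_estimators from 150 to 50
# shrinks each regressor to about one third of its current footprint.
rf_old = RandomForestRegressor(n_estimators=150, n_jobs=-1)  # previous setting
rf_new = RandomForestRegressor(n_estimators=50, n_jobs=-1)   # proposed setting
```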

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 73.50%. Comparing base (e4a5b7a) to head (f154d2e).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1294      +/-   ##
==========================================
+ Coverage   73.49%   73.50%   +0.01%
==========================================
  Files         134      134
  Lines       14210    14210
==========================================
+ Hits        10443    10445       +2
+ Misses       3767     3765       -2
```

:umbrella: View full report in Codecov by Sentry.

moralejo commented 2 weeks ago

Some comparisons here: RF_tree_reduction.pdf

moralejo commented 2 weeks ago

> Why not change the classifier parameters as well?

The tests for that (since they involve background, ideally from real data) are more complicated. The main goal of this change is to reduce the memory footprint of RF training and application, and the biggest RFs are those for regression. Currently the split is 67% regression, 33% classification (according to the models' .sav file sizes). With this change the total RF size will become ~55% of what it is now, with a share of 60% for classification and 40% for regression. If we aim for a further reduction we should indeed target the classifiers.
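
As a sanity check on those numbers, a small back-of-the-envelope calculation, assuming model size scales linearly with the number of trees (the 67% / 33% split is the one quoted above):

```python
# Current split of the total RF size (from the models' .sav files quoted above)
reg_share, cls_share = 0.67, 0.33

# Regressor trees go from 150 to 50, so the regression part shrinks by ~3x
reg_share_new = reg_share * 50 / 150

total_new = reg_share_new + cls_share
print(f"new total size: {total_new:.0%} of current")         # ~55%
print(f"classification share: {cls_share / total_new:.0%}")  # ~60%
print(f"regression share: {reg_share_new / total_new:.0%}")  # ~40%
```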

morcuended commented 3 days ago

There is something wrong with the mamba setup step in the CI (it is taking > 50 min).

moralejo commented 3 days ago

> There is something wrong with the mamba setup step in the CI (it is taking > 50 min).

Can we ignore it?

morcuended commented 3 days ago

Tests will not run either until we merge #4563 or the problem with mamba is solved. Anyway, I think the changes here are not covered by the tests. I'd say it can be merged.