cmstas / HggAnalysisDev

3 stars 7 forks source link

Updates to pre-selections and json files #11

Closed bsathian closed 3 years ago

bsathian commented 3 years ago

Thanks for the PR, Balaji. A few comments:

  • Why do you change to abs weight? In practice, this shouldn't make much of a difference for BDT training, but we get the "correct" distributions when we include negative weights.

xgboost does not support negative weights anymore (https://discuss.xgboost.ai/t/does-xgboost-support-negative-instance-weights/1925/5). Turns out negative weights don't work the way we expect it to work - xgboost used to ignore negative weight events. This was in fact explicitly addressed in an xgboost PR (https://github.com/dmlc/xgboost/pull/6115). So I wouldn't want to use negative weights especially given that a PR was opened to address this specific issue in xgboost

  • Can we undo the import numpy -> import numpy as np OR just change it everywhere to np

Okay. I'll change it to import numpy everywhere

  • Can you clean up the changes to analysis_selections.py? Currently we have a cut on n_lep + n_tau >= 1 and a cut on n_tau >= 1, so the first one is not needed. And add the new cut to the cut_diagnostics object

Sure