Move lumi normalization to topeft processor removing from latter scripts

jlawrenc commented 1 year ago

This add the luminosity normalization into the topeft.py processor. The luminosity normalization is also removed from script that use the output of topeft.py. The YieldTools.py script has been tested (via the get_yield_json.py) and has been shown to work. The other two changes in make_cr_and_sr_plots.py and dataDrivenEstimations.py have not been tested.

kmohrman commented 1 year ago

@jlawrenc I think the data card maker scripts (make_cards.py or topcoffea/modules/datacard_tools.py) must also scale by the lumi. Could you track down where those happen and remove that scaling as well?

kmohrman commented 1 year ago

@jlawrenc to test that the normalization looks ok for the CR plotting script, could you try to run it on one of the more recent CR pkl files and make sure the normalization looks reasonable (i.e. like what's in the AN)? You can run it with a command like the following:

time python make_cr_and_sr_plots.py -f /afs/crc.nd.edu/user/k/kmohrman/coffea_dir/pkl_files/oct06_fullRun2_withSys_anatest23_cr_np_noRFEnv.pkl.gz -o /your/out/dir -y UL17 -s

jlawrenc commented 1 year ago

Thank @kmohrman for the feedback. I have removed the other lumi normalization that you pointed out. I also tested out the make_cr_and_sr_plots.py and do not see any difference in the plots with the changes.

kmohrman commented 1 year ago

Thanks @jlawrenc for these updates. Would you be able to fix/check the following?

Fix the conflicts with respect to the technical_improvements branch
Check that the data cards produced by the datacard maker are the same before and after the normalization changes. You can do this by running a command like the following (using the example pkl file in the command) on both the master branch and on this branch:
```
python make_cards.py /afs/crc.nd.edu/user/k/kmohrman/Public/fullR2_files/pkl_files/dec18_fullRun2_withSys_anatest24_np.pkl.gz -C --do-nuisance --var-lst lj0pt ptz -d /some/out/dir --unblind --do-mc-stat
```
And the resulting cards can be compared against the ones here: /afs/cern.ch/user/k/kmohrman/public/top22006/datacards/at24v01/templates.

codecov[bot] commented 1 year ago

Codecov Report

Merging #351 (4bb1f37) into technical_improvements (7c0cb46) will decrease coverage by 0.24%. The diff coverage is 57.14%.

@@                    Coverage Diff                     @@
##           technical_improvements     #351      +/-   ##
==========================================================
- Coverage                   33.89%   33.65%   -0.24%     
==========================================================
  Files                          40       40              
  Lines                        6742     6670      -72     
==========================================================
- Hits                         2285     2245      -40     
+ Misses                       4457     4425      -32

Flag	Coverage Δ
unittests	`33.65% <57.14%> (-0.24%)`	:arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
analysis/topEFT/make_cr_and_sr_plots.py	`0.00% <ø> (ø)`
topcoffea/modules/datacard_tools.py	`67.75% <ø> (-0.54%)`	:arrow_down:
analysis/topEFT/topeft.py	`8.42% <33.33%> (-0.02%)`	:arrow_down:
topcoffea/modules/YieldTools.py	`42.51% <66.66%> (-0.38%)`	:arrow_down:
topcoffea/modules/dataDrivenEstimation.py	`67.88% <100.00%> (-1.54%)`	:arrow_down:

... and 2 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

bryates commented 1 year ago

@jlawrenc it looks like just flake8 is failing now because the lumi functions are still imported in the files where you removed it (see CI) https://github.com/TopEFT/topcoffea/blob/18f1698943c6ae0ccf41c266191c20ad40184f54/topcoffea/modules/YieldTools.py#L6 https://github.com/TopEFT/topcoffea/blob/18f1698943c6ae0ccf41c266191c20ad40184f54/topcoffea/modules/dataDrivenEstimation.py#L4

jlawrenc commented 1 year ago

Thank @bryates, just for my own edification, this test will fail if something is imported but not used?

bryates commented 1 year ago

Thank @bryates, just for my own edification, this test will fail if something is imported but not used?

Flake8 is a style checker, so it wants to make sure you follow the PEP8 style guide. In this case, importing and not using a module is just a waste of memory, so it marks it is bad.

bryates commented 1 year ago

Looks like the new commit introduced a bunch of white spaces it doesn't like, as well as a few imports https://github.com/TopEFT/topcoffea/actions/runs/4556627573/jobs/8037225783?pr=351#step:4:37

bryates commented 1 year ago

Looks like the new commit introduced a bunch of white spaces it doesn't like, as well as a few imports https://github.com/TopEFT/topcoffea/actions/runs/4556627573/jobs/8037225783?pr=351#step:4:37

Awesome, flake8 now passes! Thanks @jlawrenc

jlawrenc commented 1 year ago

All requested tests have been made and all CI checks are passing

kmohrman commented 1 year ago

Thanks very much @jlawrenc for these updates and checks. Sorry if I just missed this, but have you been able to verify that the yields in the datacards for a "standard" pkl file (i.e. one that corresponds to a standard analysis run) look consistent as requested here?

I know we discussed in person that the check does not make sense with our "official" pkl files (since they have not had the lumi normalization applied in the processor), but I think we talked about a check that might make more sense would be to just reproduce an "official" pkl file with the updated processor and run the datacard maker on that.

kmohrman commented 1 year ago

Since @jlawrenc has answered all of @bryates's comments and confirmed at the meeting on Friday that he ran the checks described here to confirm that running the datacard maker on the full sized pkl file produced the expected yields, I will go ahead and merge into the technical_improvements branch now.

TopEFT / topeft

Move lumi normalization to topeft processor removing from latter scripts #351

Codecov Report