nikitapond closed this 9 months ago
Should probably also add dummy paths to the models, and add a test function which generates dummy data to demonstrate working plotting.
Attention: 25 lines in your changes are missing coverage. Please review.

Comparison is base (42d0029) 97.57% compared to head (46d7678) 97.25%.

:exclamation: Current head 46d7678 differs from pull request most recent head 159238c. Consider uploading reports for the commit 159238c to get more accurate results.
| Files | Patch % | Lines |
|---|---|---|
| puma/hlplots/plot_ftag.py | 89.39% | 14 Missing :warning: |
| puma/hlplots/yutils.py | 89.70% | 7 Missing :warning: |
| puma/hlplots/configs.py | 95.00% | 4 Missing :warning: |
Thanks for this @nikitapond, I was hoping we could avoid adding too much code in between the HLAPI we already have and the yaml configs. Do you think it would be possible to structure the yaml much more closely with the existing puma HLAPI classes (`Tagger` and `Results`)?
e.g. we would have `tagger.yaml`, which corresponds closely to the `Tagger` constructor:
```yaml
tagger_defaults:
  fb: 0.05
  fc: 0.4
taggers:
  - name: GN2v00
    colour: red
    fb: 0.1
```
which can be instantiated as:

```python
taggers = []
for tagger_cfg in tagger_cfgs:
    # note: dict.update() returns None, so merge the dicts instead
    taggers.append(Tagger(**{**tagger_defaults, **tagger_cfg}))
```
And then we can have samples which each own their own taggers, such that we can do
```python
for sample in samples:
    this_taggers = [t for t in taggers if t.name in sample.taggers]
    results.add_taggers_from_file(this_taggers, ...)
```
Then we just have a script that combines the above with calls to the different plotting functions:

```python
results.plot_roc(**roc_args)
```
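For illustration, a hypothetical end-to-end sketch of such a script; the `Results` constructor arguments, the `sample.name`/`sample.path` attributes, and the `add_taggers_from_file(taggers, path)` signature are assumptions here, not the confirmed API:

```python
from puma.hlplots import Results

# hypothetical driver: one Results object per sample, one set of plots each
for sample in samples:
    results = Results(signal="bjets", sample=sample.name)  # assumed constructor args
    this_taggers = [t for t in taggers if t.name in sample.taggers]
    results.add_taggers_from_file(this_taggers, sample.path)  # assumed signature
    results.plot_roc(**roc_args)
```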
I think it was maybe overkill in the old repo to have multiple samples configured in the same config; it might be cleaner to just have a `ttbar.yaml` and a `zprime.yaml`, and then the top-level entry point `make_all_plotz` (or whatever) gets called once for each config, uses one `Results` object to produce one set of plots, and then exits.

Let me know what you think.
> e.g. we would have `tagger.yaml`, which corresponds closely to the `Tagger` constructor
I think this is a nice idea in principle, but we need each tagger to contain some information about which evaluation sample it uses, i.e. we couldn't just have:
```yaml
taggers:
  - name: GN2v00
    colour: red
    fb: 0.1
```
We'd need something like:

```yaml
taggers:
  - name: GN2v00
    colour: red
    sample_path: /path/to/evaluation/file.h5
    fb: 0.1
```
And then before loading each tagger, pop the path and store it in a separate list, then ensure we select the correct file when running:

```python
for sample in samples:
    idx = [i for i, t in enumerate(taggers) if t.name in sample.taggers]
    for i in idx:
        results.add_taggers_from_file([taggers[i]], file_paths[i], ...)
```
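For completeness, a minimal sketch of the path-popping step this assumes; the `file_paths` bookkeeping and the `sample_path` key come from the discussion above, the rest is hypothetical:

```python
taggers, file_paths = [], []
for cfg in tagger_cfgs:
    # pop the path so the remaining kwargs match the Tagger constructor
    file_paths.append(cfg.pop("sample_path"))
    taggers.append(Tagger(**cfg))
```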
Let me know if this sounds okay.
> ```yaml
> tagger_defaults:
>   fb: 0.05
>   fc: 0.4
> taggers:
>   - name: GN2v00
>     colour: red
>     fb: 0.1
> ```
So I've had a look back at this, and I think there's also a fairly significant issue with simplifying the structure so much. The results class uses `tagger.name` to define the variables for probabilities, i.e. `name_pb`, `name_pc`, etc. This means this format wouldn't be able to plot the same tagger evaluated on different datasets, for example.
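To illustrate the collision (a hypothetical sketch; the `_pb`/`_pc` suffix scheme is taken from the comment above):

```python
# two entries for the same tagger, evaluated on two different samples...
names = ["GN2v00", "GN2v00"]
# ...resolve to identical probability variable names, so the results clash
prob_vars = [f"{n}_pb" for n in names]  # ['GN2v00_pb', 'GN2v00_pb']
```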
Could we have something like this?

```yaml
taggers:
  - path: path/to/h5
    tagger: {Tagger kwargs}
```
With also the option to have the `path` key default to some globally defined path if it isn't specified?
> Could we have something like this?
>
> ```yaml
> taggers:
>   - path: path/to/h5
>     tagger: {Tagger kwargs}
> ```
Yeah this could work, so you envision a `taggers_ttbar.yaml`, `taggers_zprime.yaml`, `plots_ttbar.yaml` and `plots_zprime.yaml`?
> With also the option to have the `path` key default to some globally defined path if it isn't specified?
By this, you mean that if it's got

```yaml
taggers:
  - path: path/to/h5
    ...
```

where `path` is a relative path, it does

```python
path = Path(global_path) / path  # `global` is a reserved word, so e.g. global_path
```

sort of thing?
No, more that if you had many tagger scores in the same file (i.e. the simplest case), you could avoid repeating yourself by doing:

```yaml
sample_path: path/to/h5
taggers:
  - tagger: {Tagger 1 kwargs}
  - tagger: {Tagger 2 kwargs}
  - sample_path: override/global/sample_path
    tagger: {Tagger 3 kwargs}
```
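A minimal sketch of how such a config could be resolved; the file name `taggers.yaml` and the loop body are assumptions for illustration:

```python
import yaml
from pathlib import Path

from puma.hlplots import Tagger

with open("taggers.yaml") as f:  # hypothetical file name
    config = yaml.safe_load(f)

global_path = config.get("sample_path")  # optional global default
for entry in config["taggers"]:
    # a per-entry sample_path overrides the global one if present
    path = Path(entry.get("sample_path", global_path))
    tagger = Tagger(**entry["tagger"])
```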
Anyway maybe it's a small point and could be added later. My main point is just that it would be great if we could try to keep init arguments neatly in their own config block wherever possible. Hence the suggestion of:

```yaml
taggers:
  - sample_path: path/to/h5
    tagger: {Tagger kwargs}
```

rather than including the sample path in the `tagger` config block.
Zooming out a bit, a lot of the complexity here comes from the fact that the tagger class does not accept a sample path as input, when maybe it should. The issue with that is that we want to load taggers in the same files at the same time, to avoid wasting disk reads due to compression on the jets dataset. There are two possible solutions:

- add the sample path to the `Tagger` class init but lazily load data later on (e.g. handled by the results class)

The last option is probably not too big a change, maybe we should implement that first? I.e. add a path to the tagger init, and a `load()` function to the results class, which finds all the groups of taggers with a common filename and calls `Results.add_taggers_from_file` for each group.
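A minimal sketch of what that `load()` could look like, assuming (per the proposal) that `Tagger` gains a `sample_path` attribute; nothing here is the existing API:

```python
from collections import defaultdict

def load(results, taggers):
    # group taggers that share a sample file so each file is read only once
    groups = defaultdict(list)
    for tagger in taggers:
        groups[tagger.sample_path].append(tagger)
    for path, group in groups.items():
        results.add_taggers_from_file(group, path)
```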
> No, more that if you had many tagger scores in the same file (i.e. the simplest case)

Okay I get you now; I think my misunderstanding was that I've mostly been using this on evaluated training files, and so have never had to load more than one tagger from a given file.
> Anyway maybe it's a small point and could be added later. My main point is just that it would be great if we could try to keep init arguments neatly in their own config block wherever possible.

> - add the sample path to the `Tagger` class init but lazily load data later on (e.g. handled by the results class)
>
> The last option is probably not too big a change, maybe we should implement that first? I.e. add a path to the tagger init, and a `load()` function to the results class, which finds all the groups of taggers with a common filename and calls `Results.add_taggers_from_file` for each group.
I think this makes a lot of sense, I'll start implementing this.
In terms of next steps for this, I'd suggest making an issue to track things that don't need to be in this first MR, and then for this MR just working on documentation and tests so others can start using this sooner rather than later.
This is looking great! Are you able to increase the test coverage a bit so we can avoid the overall drop in coverage? Maybe @dkobylianskii also wants to take a look; Dmitrii, you had some experience with the old GNNJetTagger repo plotting stuff, right?
So, for the other folds, it feels excessive to include all configs here, but I'm wondering if we need to keep track somewhere of which hash we use for each fold? I.e. this config has train/val: `hash % 4 != 3`, test: `hash % 4 == 3`. For each other config, we simply need to change this `3` to another number. I guess the logical approach would be to call them `fold_n`, where `n` in the above config is 3?

Edit: Posted on the wrong PR.
Also one more request, would it be possible to add a few plots to the efficiency profiles, showing the rejection at constant efficiency versus PU and Lxy (ideally we would add number of b-tracks also at some point but since that one requires code changes I would save it for a second MR)
> Also one more request, would it be possible to add a few plots to the efficiency profiles, showing the rejection at constant efficiency versus PU and Lxy (ideally we would add number of b-tracks also at some point but since that one requires code changes I would save it for a second MR)
The mock file doesn't currently have either of these variables; do `eta` and `ghost_hadron_pt` work?
Tagging @dkobylianskii for a review/merge. The coverage drops, but this is of the order of fractions of a percent.
Thanks @nikitapond, does this close #227?
> Thanks @nikitapond, does this close #227?
Not really. The hlplots interface still expects a `peff_var` to be defined. This does streamline things by allowing the user to define peff plots over different variables. The various plot configs are then sorted and grouped based on the x-variable, and then plotted, to minimise the overall amount of re-loading, see here.

This is far from the most efficient way of doing it, but in my experience it doesn't take too long to load, so it does make the overall user experience of making peff plots simpler.
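A hypothetical sketch of the sort/group step described above; the `peff_var` key comes from the comment, while the `plot_configs` structure is an assumption:

```python
from itertools import groupby

def peff_key(cfg):
    return cfg["peff_var"]

# sort plot configs by their x-variable, then handle each group together,
# so the x-variable data only needs to be loaded once per group
for xvar, group in groupby(sorted(plot_configs, key=peff_key), key=peff_key):
    for cfg in group:
        ...  # make the plot for this config using the shared x-variable
```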
Hi @nikitapond, thanks a lot for this work! I will review it over the weekend. I added MR #233 to close #227; we can merge it before this and update the Yuma code accordingly. What do you think?
> Hi @nikitapond, thanks a lot for this work! I will review it over the weekend. I added MR #233 to close #227; we can merge it before this and update the Yuma code accordingly. What do you think?
Hey @dkobylianskii, this sounds good to me. Once #233 is merged I'll make any changes here.
Thanks both! #233 is merged!
Pre-commit is still complaining - will merge as soon as this is fixed, thanks!
> Pre-commit is still complaining - will merge as soon as this is fixed, thanks!
Yeah I see. Not sure why, but locally pre-commit is refusing to run; it seems to want to use Python 3.1 instead of 3.10...
Thanks @nikitapond!
Last thing - could you add a changelog entry?
Summary
This pull request introduces the following changes:
Still a work in progress:
Requires #208