nikitapond closed this 9 months ago
Should probably also add dummy paths to the models, and add a test function which generates dummy data to demonstrate working plotting.
Attention: 25 lines in your changes are missing coverage. Please review.

Comparison is base (42d0029) 97.57% compared to head (46d7678) 97.25%.

:exclamation: Current head 46d7678 differs from pull request most recent head 159238c. Consider uploading reports for the commit 159238c to get more accurate results.
| Files | Patch % | Lines |
|---|---|---|
| puma/hlplots/plot_ftag.py | 89.39% | 14 Missing :warning: |
| puma/hlplots/yutils.py | 89.70% | 7 Missing :warning: |
| puma/hlplots/configs.py | 95.00% | 4 Missing :warning: |
Thanks for this @nikitapond, I was hoping we could avoid adding too much code in between the HLAPI we already have and the yaml configs. Do you think it would be possible to structure the yaml much more closely with the existing puma HLAPI classes (`Tagger` and `Results`)?
e.g. we would have `tagger.yaml`, which corresponds closely to the `Tagger` constructor:
```yaml
tagger_defaults:
  fb: 0.05
  fc: 0.4
taggers:
  - name: GN2v00
    colour: red
    fb: 0.1
```
which can be instantiated as:

```python
taggers = []
for tagger_cfg in tagger_cfgs:
    # note: dict.update() returns None, so merge the dicts instead
    taggers.append(Tagger(**{**tagger_defaults, **tagger_cfg}))
```
And then we can have samples which each own their own taggers, such that we can do
```python
for sample in samples:
    this_taggers = [t for t in taggers if t.name in sample.taggers]
    results.add_taggers_from_file(this_taggers, ...)
```
Then we just have a script that combines the above with calls to the different plotting functions:

```python
results.plot_roc(**roc_args)
```
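For illustration, a hypothetical end-to-end sketch of such a script; the `Results` constructor arguments, the `sample.name`/`sample.path` attributes, and the `add_taggers_from_file(taggers, path)` signature are assumptions here, not the confirmed API:

```python
from puma.hlplots import Results

# hypothetical driver: one Results object per sample, one set of plots each
for sample in samples:
    results = Results(signal="bjets", sample=sample.name)  # assumed constructor args
    this_taggers = [t for t in taggers if t.name in sample.taggers]
    results.add_taggers_from_file(this_taggers, sample.path)  # assumed signature
    results.plot_roc(**roc_args)
```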
I think it was maybe overkill in the old repo to have multiple samples configured in the same config; it might be cleaner to just have a `ttbar.yaml` and a `zprime.yaml`, and then the top-level entry point `make_all_plotz` (or whatever) gets called once for each config, uses one `Results` object to produce one set of plots, and then exits.

Let me know what you think.
> e.g. we would have `tagger.yaml`, which corresponds closely to the `Tagger` constructor
I think this is a nice idea in principle, but we need each tagger to contain some information about which evaluation sample it uses, i.e. we couldn't just have:
```yaml
taggers:
  - name: GN2v00
    colour: red
    fb: 0.1
```
We'd need something like:

```yaml
taggers:
  - name: GN2v00
    colour: red
    sample_path: /path/to/evaluation/file.h5
    fb: 0.1
```
And then before loading each tagger, pop the path and store it in a separate list, then ensure we select the correct file when running:

```python
for sample in samples:
    idx = [i for i, t in enumerate(taggers) if t.name in sample.taggers]
    for i in idx:
        results.add_taggers_from_file([taggers[i]], file_paths[i], ...)
```
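For completeness, a minimal sketch of the path-popping step this assumes; the `file_paths` bookkeeping and the `sample_path` key come from the discussion above, the rest is hypothetical:

```python
taggers, file_paths = [], []
for cfg in tagger_cfgs:
    # pop the path so the remaining kwargs match the Tagger constructor
    file_paths.append(cfg.pop("sample_path"))
    taggers.append(Tagger(**cfg))
```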
Let me know if this sounds okay.
> ```yaml
> tagger_defaults:
>   fb: 0.05
>   fc: 0.4
> taggers:
>   - name: GN2v00
>     colour: red
>     fb: 0.1
> ```
So I've had a look back at this, and I think there's also a fairly significant issue with simplifying the structure so much. The results class uses `tagger.name` to define the variables for probabilities, i.e. `name_pb`, `name_pc`, etc. This means this format wouldn't be able to plot the same tagger evaluated on different datasets, for example.
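To illustrate the collision (a hypothetical sketch; the `_pb`/`_pc` suffix scheme is taken from the comment above):

```python
# two entries for the same tagger, evaluated on two different samples...
names = ["GN2v00", "GN2v00"]
# ...resolve to identical probability variable names, so the results clash
prob_vars = [f"{n}_pb" for n in names]  # ['GN2v00_pb', 'GN2v00_pb']
```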
Could we have something like this?

```yaml
taggers:
  - path: path/to/h5
    tagger: {Tagger kwargs}
```
With also the option to have the `path` key default to some globally defined path if it isn't specified?
> Could we have something like this?
>
> ```yaml
> taggers:
>   - path: path/to/h5
>     tagger: {Tagger kwargs}
> ```
Yeah this could work, so you envision a `taggers_ttbar.yaml`, `taggers_zprime.yaml`, `plots_ttbar.yaml` and `plots_zprime.yaml`?
> With also the option to have the `path` key default to some globally defined path if it isn't specified?
By this, you mean that if it's got

```yaml
taggers:
  - path: path/to/h5
    ...
```

where `path` is a relative path, it does

```python
path = Path(global_path) / path  # `global` is a reserved word, so e.g. global_path
```

sort of thing?
No, more that if you had many tagger scores in the same file (i.e. the simplest case), you could avoid repeating yourself by doing:

```yaml
sample_path: path/to/h5
taggers:
  - tagger: {Tagger 1 kwargs}
  - tagger: {Tagger 2 kwargs}
  - sample_path: override/global/sample_path
    tagger: {Tagger 3 kwargs}
```
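A minimal sketch of how such a config could be resolved; the file name `taggers.yaml` and the loop body are assumptions for illustration:

```python
import yaml
from pathlib import Path

from puma.hlplots import Tagger

with open("taggers.yaml") as f:  # hypothetical file name
    config = yaml.safe_load(f)

global_path = config.get("sample_path")  # optional global default
for entry in config["taggers"]:
    # a per-entry sample_path overrides the global one if present
    path = Path(entry.get("sample_path", global_path))
    tagger = Tagger(**entry["tagger"])
```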
Anyway maybe it's a small point and could be added later. My main point is just that it would be great if we could try to keep init arguments neatly in their own config block wherever possible. Hence the suggestion of:

```yaml
taggers:
  - sample_path: path/to/h5
    tagger: {Tagger kwargs}
```

rather than including the sample path in the `tagger` config block.
Zooming out a bit, a lot of the complexity here comes from the fact that the tagger class does not accept a sample path as input, when maybe it should. The issue with that is that we want to load taggers in the same files at the same time, to avoid wasting disk reads due to compression on the jets dataset. There are two possible solutions:

- add the sample path to the `Tagger` class init but lazily load data later on (e.g. handled by the results class)

The last option is probably not too big a change, maybe we should implement that first? I.e. add a path to the tagger init, and a `load()` function to the results class, which finds all the groups of taggers with a common filename and calls `Results.add_taggers_from_file` for each group.
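A minimal sketch of what that `load()` could look like, assuming (per the proposal) that `Tagger` gains a `sample_path` attribute; nothing here is the existing API:

```python
from collections import defaultdict

def load(results, taggers):
    # group taggers that share a sample file so each file is read only once
    groups = defaultdict(list)
    for tagger in taggers:
        groups[tagger.sample_path].append(tagger)
    for path, group in groups.items():
        results.add_taggers_from_file(group, path)
```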
> No, more that if you had many tagger scores in the same file (i.e. the simplest case)

Okay I get you now; I think my misunderstanding was that I've mostly been using this on evaluated training files, and so have never had to load more than one tagger from a given file.
> Anyway maybe it's a small point and could be added later. My main point is just that it would be great if we could try to keep init arguments neatly in their own config block wherever possible.

> - add the sample path to the `Tagger` class init but lazily load data later on (e.g. handled by the results class)
>
> The last option is probably not too big a change, maybe we should implement that first? I.e. add a path to the tagger init, and a `load()` function to the results class, which finds all the groups of taggers with a common filename and calls `Results.add_taggers_from_file` for each group.
I think this makes a lot of sense, I'll start implementing this.
In terms of next steps for this, I'd suggest making an issue to track things that don't need to be in this first MR, and then for this MR just working on documentation and tests so others can start using this sooner rather than later.
This is looking great! Are you able to increase the test coverage a bit so we can avoid the overall drop in coverage? Maybe @dkobylianskii also wants to take a look; Dmitrii, you had some experience with the old GNNJetTagger repo plotting stuff, right?
So, for the other folds, it feels excessive to include all configs here, but I'm wondering if we need to keep track somewhere of which hash we use for each fold? I.e. this config has train/val: `hash % 4 != 3`, test: `hash % 4 == 3`. For each other config, we simply need to change this `3` to another number. I guess the logical approach would be to call them `fold_n`, where `n` in the above config is 3?

Edit: Posted on the wrong PR.
Also one more request, would it be possible to add a few plots to the efficiency profiles, showing the rejection at constant efficiency versus PU and Lxy (ideally we would add number of b-tracks also at some point but since that one requires code changes I would save it for a second MR)
> Also one more request, would it be possible to add a few plots to the efficiency profiles, showing the rejection at constant efficiency versus PU and Lxy (ideally we would add number of b-tracks also at some point but since that one requires code changes I would save it for a second MR)
The mock file doesn't currently have either of these variables; do `eta` and `ghost_hadron_pt` work?
Tagging @dkobylianskii for a review/merge. The coverage drops, but this is of the order of fractions of a percent.
Thanks @nikitapond, does this close #227?
> Thanks @nikitapond, does this close #227?
Not really. The hlplots interface still expects a `peff_var` to be defined. This does streamline things by allowing the user to define peff plots over different variables. The various plot configs are then sorted and grouped based on the x-variable, and then plotted, to minimise the overall amount of re-loading, see here.

This is far from the most efficient way of doing it, but in my experience it doesn't take too long to load, so it does make the overall user experience of making peff plots simpler.
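A hypothetical sketch of the sort/group step described above; the `peff_var` key comes from the comment, while the `plot_configs` structure is an assumption:

```python
from itertools import groupby

def peff_key(cfg):
    return cfg["peff_var"]

# sort plot configs by their x-variable, then handle each group together,
# so the x-variable data only needs to be loaded once per group
for xvar, group in groupby(sorted(plot_configs, key=peff_key), key=peff_key):
    for cfg in group:
        ...  # make the plot for this config using the shared x-variable
```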
Hi @nikitapond, thanks a lot for this work! I will review it over the weekend. I added MR #233 to close #227; we can merge it before this and update the Yuma code accordingly. What do you think?
> Hi @nikitapond, thanks a lot for this work! I will review it over the weekend. I added MR #233 to close #227; we can merge it before this and update the Yuma code accordingly. What do you think?
Hey @dkobylianskii, this sounds good to me. Once #233 is merged I'll make any changes here.
Thanks both! #233 is merged!
Pre-commit is still complaining - will merge as soon as this is fixed, thanks!
> Pre-commit is still complaining - will merge as soon as this is fixed, thanks!
Yeah I see. Not sure why, but locally pre-commit is refusing to run; it seems to want to use Python 3.1 instead of 3.10...
Thanks @nikitapond!
Last thing - could you add a changelog entry?
Summary
This pull request introduces the following changes:
Still a work in progress:
Requires #208