scikit-hep / pyBumpHunter

Python implementation of the BumpHunter algorithm used by HEP community.
BSD 3-Clause "New" or "Revised" License
17 stars 9 forks source link

Multi chan #10

Closed lovaslin closed 2 years ago

lovaslin commented 2 years ago

Hello,

This is the next major update of pyBumpHunter. It will bring a few bug fixes and improvement, and also the multi-channel combination option. This option allows to search for a consistent bump across multiple channels, both with 1D and 2D histograms.

About the problem with setup.py (issue#8) I removed the problematic import. However, I am not sure what to do to remove completely the setup.py file without breaking anything. Is there any procedure I need to follow for that ?

I plan to do the release immediately after merging. So if someone find any bugs or if something important is missing, do not hesitate to ping me before I do anything.

Have a nice day !

lovaslin commented 2 years ago

I just requested a review by 3 people, just to make sure I am not doing anything wrong.

henryiii commented 2 years ago

I believe we are rapidly approaching a point where dropping setup.py will have no consequences, so it's up to you if you'd like to be a guinea pig for this. It's a pure Python library, so almost everyone will be getting wheels anyway, and wheels don't even contain the setup.py if you have one. The only difference is the SDist will requires a slightly newish version of pip, and getting an editable install will require a very new version of pip (from this year, IIRC 21.1 or so). But only developers should be doing editable installs, so that's likely okay.

What you have now is perfectly fine and inline with our other packages. Part of the reason is that extras can't depend on other extras until pip 21.2 (and that affects wheels, so all users), so if you have several complicated extras, then setup.py is still nice for combining extras.

jonas-eschle commented 2 years ago

I also understand that you want to create a lot of refactoring, and that could be worth it (e.g. just merging the single, multi, maybe even an ND), so fine if things here aren't perfect. But I would still make sure in general to move towards the direction of refactoring, and not to add a lot of legacy code.

But again, seems all going into the right direction ;)

Let me know once you polished things and I'll do a final review

lovaslin commented 2 years ago

This new update aims at solving the issue of the global significance going to infinity when the global p-value is exactly 0. I also improved the signal injection with the possibility to control the number of background-only and background+signal pseudo-experiments separately.

I think the new version is about ready to be merged in master. The next version of pyBumpHunter will be published on PyPI just after merging.

I plan to change a few things in the API in the next version. This will include the way pyBumpHunter handles the input histograms (and will most probably break compatibility with the current API). I think it will be much easier to unify the single-channel and multi-channel mode with this new API.

jonas-eschle commented 2 years ago

That looks indeed better, thanks for the update Let's resolve the conflicts and then go ahead, Looking forward to the simplified API!