scikit-hep / pyBumpHunter

Python implementation of the BumpHunter algorithm used by HEP community.
BSD 3-Clause "New" or "Revised" License
17 stars 9 forks source link

Update api #5

Closed lovaslin closed 3 years ago

eduardo-rodrigues commented 3 years ago

Hi @lovaslin, cc @mayou36 - I noticed these great updates but also see that you simply updated the version as a patch version from 0.3.0 to 0.3.1. That's not good and against the "rules", right ;-). A patch version update is for bug fixes and trivial things. A change of API often even calls for a major version change, which would be 1.0.0 for you. At the very least you should do 0.4.0. But 1.0.0 actually seems justified in your case. My 2 cents :-).

lovaslin commented 3 years ago

Hello,

In fact, several names has been changed, but there is no real new thing in pyBumpHunter. All the old names are still available with deprecation warning. Since old codes are technically not broken, going to v1.0.0 seems a bit to much. But maybe v0.4.0 could be more appropriate ...

However, I have plans for pyBumpHunter that would definitely require to go to v1.0.0. I would like to separate the "histograming" and the "scaning" part. For that I am trying to create a DataHandler class that manage the all the histograms in one object, with possibility to read data in several different format. It should make pyBumpHunter easier to include in a analysis workflow that uses other tools, like boost-historgam (for example).

De: "Eduardo Rodrigues" @.> À: "scikit-hep/pyBumpHunter" @.> Cc: "Louis VASLIN" @.>, "Mention" @.> Envoyé: Vendredi 9 Avril 2021 10:31:58 Objet: Re: [scikit-hep/pyBumpHunter] Update api (#5)

Hi [ https://github.com/lovaslin | @lovaslin ] , cc [ https://github.com/mayou36 | @mayou36 ] - I noticed these great updates but also see that you simply updated the version as a patch version from 0.3.0 to 0.3.1. That's not good and against the "rules", right ;-). A patch version update is for bug fixes and trivial things. A change of API often even calls for a major version change, which would be 1.0.0 for you. At the very least you should do 0.4.0. But 1.0.0 actually seems justified in your case. My 2 cents :-).

— You are receiving this because you were mentioned. Reply to this email directly, [ https://github.com/scikit-hep/pyBumpHunter/pull/5#issuecomment-816517655 | view it on GitHub ] , or [ https://github.com/notifications/unsubscribe-auth/APPWEU2SL43AZFCKVFYPEL3TH23P5ANCNFSM42UPTA3A | unsubscribe ] .

jonas-eschle commented 3 years ago

Indeed, I agree with @eduardo-rodrigues . To be fair, mostly it's only deprecations but such changes should rather go into a different update. The expectation is that one can upgrade in the minor version without any changes that affect the behavior. For 1.0.0 I would suggest to remove the old API then, so 0.4.0 seems a good pick; try to use that in the future.

On another note, when you do a large PR in general, but especially for releases, maybe make sure to get a review from people first, especially in the beginning. This slows down things a bit but increases stability and - mainly - makes other people aware of your code. This is crucial since having it in Scikit-HEP vs as a private repo means that a) others will also maintain it and need therefore to see your code from time to time b) it is also not "your" repository anymore but "ours" and it acts as a representative of Scikit-HEP, meaning it is also good to give people a chance to add comments. It all goes with the fact that it is an entirely different story to write a useful script vs having a well-designed, stable package in Scikit-HEP, meaning you can also depend on the support of others (and their golden experience).

Not to mention that this helps avoiding mistakes which we all commit from time to time :)

And oc smaller PRs etc are all fine, it's just a help in the end if someone reviews

jonas-eschle commented 3 years ago

For that I am trying to create a DataHandler class that manage the all the histograms in one object, with possibility to read data in several different format.

That sounds good, maybe you can even go a step further and just leave that to other libraries and e.g. stick to only accept boost-histograms or similar? Just to put as few things as possible into this library basically and we can assume that a user has access to the whole Scikit-HEP stack usually.

But please go ahead, that sounds really good, cc me in the PR, interested to see where it is going and gladly helping with the design

eduardo-rodrigues commented 3 years ago

Thanks for these excellent comments, @mayou36, which I totally subscribe to, as you can imagine ;-). Indeed we are an org / a team / a bunch of collaborators. Scikit-HEP should never be seen as just "a common place to put some packages".

eduardo-rodrigues commented 3 years ago

For that I am trying to create a DataHandler class that manage the all the histograms in one object, with possibility to read data in several different format.

That sounds good, maybe you can even go a step further and just leave that to other libraries and e.g. stick to only accept boost-histograms or similar? Just to put as few things as possible into this library basically and we can assume that a user has access to the whole Scikit-HEP stack usually.

But please go ahead, that sounds really good, cc me in the PR, interested to see where it is going and gladly helping with the design

For these developments I think you should involve @henryiii since he designed https://github.com/scikit-hep/uhi and may have a lot to help you with ...

lovaslin commented 3 years ago

Understood. Sorry for my mistakes. I am new to this and I am not yet used to all the good practices. I will try to do better next time.

About the DataHandler class, I wanted to create a object that contains several histograms with shared bins. I will see if it is already possible to do something like that with boost-histogram or another Scikit-HEP related package.

De: "Eduardo Rodrigues" @.> À: "scikit-hep" @.> Cc: "Louis VASLIN" @.>, "Mention" @.> Envoyé: Vendredi 9 Avril 2021 10:55:18 Objet: Re: [scikit-hep/pyBumpHunter] Update api (#5)

BQ_BEGIN

For that I am trying to create a DataHandler class that manage the all the histograms in one object, with possibility to read data in several different format.

That sounds good, maybe you can even go a step further and just leave that to other libraries and e.g. stick to only accept boost-histograms or similar? Just to put as few things as possible into this library basically and we can assume that a user has access to the whole Scikit-HEP stack usually.

But please go ahead, that sounds really good, cc me in the PR, interested to see where it is going and gladly helping with the design BQ_END

For these developments I think you should involve [ https://github.com/henryiii | @henryiii ] since he designed [ https://github.com/scikit-hep/uhi | https://github.com/scikit-hep/uhi ] and may have a lot to help you with ...

— You are receiving this because you were mentioned. Reply to this email directly, [ https://github.com/scikit-hep/pyBumpHunter/pull/5#issuecomment-816532388 | view it on GitHub ] , or [ https://github.com/notifications/unsubscribe-auth/APPWEU3LDG2VVJXJ3RR7BEDTH26HNANCNFSM42UPTA3A | unsubscribe ] .

eduardo-rodrigues commented 3 years ago

No worries with "mistakes". We all keep improving and we should try to help ourselves towards "the best possible packages" :-).

jonas-eschle commented 3 years ago

Couldn't agree more, that's why were writing these things, and you seem to understand them and apply them well, that's all fine. Good to have you on board!

henryiii commented 3 years ago

On naming, if you want to get technical - while SerVer is a little murky on 0.x versions, in general (hardcoded into Poetry, actually), 0.x -> 0.x+1 is considered a major bump, and 0.x.y -> 0.x.y+1 is a minor bump; you need to add another digit if you really want patch versions (usually you don't care as much for 0.x libraries). You should never remove or rename something except in major bumps, though, so 0.4 would be correct - that's a breaking change. You can add features in a minor bump, but never (intentionally) break code that used to work. It's quite up to you when you jump to 1.0; at that point, then it's the usual major.minor.patch.

Fine to learn by doing, we don't expect perfection - I'd certainly not be here if we did. :)

Happy to be involved in histogram discussions.