HEPData / hepdata_lib

Library for getting your data into HEPData
https://hepdata-lib.readthedocs.io
MIT License
15 stars 39 forks source link

Dealing with symmetric and asymmetric uncertainties at the same time #213

Closed frgigr closed 1 year ago

frgigr commented 1 year ago

Hi! This MR adds more flexibility when dealing with distributions/plots having low non-zero yields: in these cases, due to the limited statistics, it may happen that the uncertainty is symmetric, but the relative value being larger with respect to the nominal value. Therefore, it's a common practice to cap the negative variation at the nominal value to avoid a negative yield, being unphysical.

From a technical point of view, uncertainties are given as asymmetric, but when the up/down variations have opposite sign only, thus sharing the same magnitude, they are considered as symmetric. On the other hand, nothing changes for asymmetric ones. So, the resulting yaml file will have symmetric or asymmetric uncertainties accordingly, improving general readability in case of mixed entries.

Finally, as uncertainties are floating-point numbers, the decimal library is used to perform the mathematical comparison and calculation(s).

Hope it is fine but, please, let me know if you have comments and/or questions. Thanks in advance, Francesco


:books: Documentation preview :books:: https://hepdata-lib--213.org.readthedocs.build/en/213/

clelange commented 1 year ago

Thanks for your contribution! I won't have time this week to review the code even though the changes are small, but I'll aim to get back to this by the middle of next week. One issue I see from having a brief look is that at the moment we're still trying to support Python2 and I think the decimal class doesn't exist there. It might be a good time to drop support for Python2 at this point though.

Would you mind thinking about a test we could implement for the proposed change?

frgigr commented 1 year ago

Hi @clelange,

Thanks for your comments! No problem, I can wait for you to get back here.

In the meanwhile, if the support for Python2 is still needed, a temporary solution (to avoid using the decimal class) would be to drop the line where the sum_unc variable is created and to replace the if-clause with the following one, although less robust:

if abs( float(unc.values[i][0]) ) == abs( float(unc.values[i][1]) )
frgigr commented 1 year ago

Just added a dummy test, to cross-check the change in this MR: as it involves uncertainties, I added a new method in the dedicated file. Thanks!

codecov-commenter commented 1 year ago

Codecov Report

Merging #213 (06cebd9) into master (74b90d8) will increase coverage by 0.05%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #213      +/-   ##
==========================================
+ Coverage   87.75%   87.80%   +0.05%     
==========================================
  Files           4        4              
  Lines         964      968       +4     
  Branches      188      189       +1     
==========================================
+ Hits          846      850       +4     
  Misses         86       86              
  Partials       32       32              
Flag Coverage Δ
unittests-3.8 87.80% <100.00%> (+0.05%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
hepdata_lib/__init__.py 90.00% <100.00%> (+0.13%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

clelange commented 1 year ago

I made a small change to the test code to make the linter happy. I also tested that the code does what it's supposed to do. I'll probably merge tomorrow, have to go now.