JuliaHEP / UpROOT.jl

Julia package to access CERN ROOT files, wraps Python package uproot
Other
15 stars 3 forks source link

Add support for reading ROOT histograms #17

Closed oschulz closed 3 years ago

oschulz commented 3 years ago

CC @gipert

codecov[bot] commented 3 years ago

Codecov Report

Merging #17 (8ab1d53) into master (0c45d32) will increase coverage by 4.61%. The diff coverage is 70.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #17      +/-   ##
==========================================
+ Coverage   39.10%   43.72%   +4.61%     
==========================================
  Files           8        8              
  Lines         202      215      +13     
==========================================
+ Hits           79       94      +15     
+ Misses        123      121       -2     
Impacted Files Coverage Δ
src/UpROOT.jl 100.00% <ø> (ø)
src/testdata.jl 100.00% <ø> (ø)
src/pyjlconv.jl 70.00% <68.42%> (+9.13%) :arrow_up:
src/tdirectory.jl 87.50% <100.00%> (+12.50%) :arrow_up:
src/ttree.jl 48.00% <0.00%> (+1.33%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0c45d32...8ab1d53. Read the comment docs.

gipert commented 3 years ago

Sorry for the delay Oli, haven't got time until now. I was thinking about this: it's a pity that StatsBase.Histogram does not support under/over-flow bins – it's something we are losing when converting from uproot hists. Do you think it would make sense to extend Histogram here to include this feature? Do you have brighter ideas?

oschulz commented 3 years ago

it's a pity that StatsBase.Histogram does not support under/over-flow bins – it's something we are losing when converting from uproot hists. Do you think it would make sense to extend Histogram here to include this feature? Do you have brighter ideas?

I've never used ROOTs over- und underflow bins much, personally, but it is a loss of information. We'd have to create a custom histogram class though - certainly possible, but would be more than a few lines of code. Maybe defer until a later time?

gipert commented 3 years ago

Maybe defer until a later time?

Yes, it's not urgent, had to make use of under/over-flow bins only couple of times.

Another thing: do you know if there is any way to know whether a ROOT histogram represents a density? Would be cool to be able to set Histogram.isdensity. I will investigate

gipert commented 3 years ago

So there is an interesting bit that could be used to set Histogram.isdensity: kIsAverage. I'm not sure I can get that with uproot tho...

gipert commented 3 years ago

Indeed: https://github.com/scikit-hep/uproot3-methods/issues/99

gipert commented 3 years ago

From what concerns myself you can merge this and tag a new version, once #18 is in!

oschulz commented 3 years ago

So there is an interesting bit that could be used to set Histogram.isdensity: kIsAverage. I'm not sure I can get that with uproot tho...

Do you know the semantics of that? StatsBase.Histogram has several normalization modes (mainly density-normalization and full normalization), both take non-homogenous binning into account. I don't know that ROOT's kIsAverage does, exactly.

oschulz commented 3 years ago

you can merge this and tag a new version, once #18 is in!

Want to try your hand at that kIsAverage status bit first (from what I understand, we can get it?) or should we do that later?

gipert commented 3 years ago

It's impossible to get it with uproot3 :(

oschulz commented 3 years ago

It's impossible to get it with uproot3 :(

Ah, sorry, I misunderstood! Ok, I'll merge this, then. :-)

gipert commented 3 years ago

Do you know the semantics of that? StatsBase.Histogram has several normalization modes (mainly density-normalization and full normalization), both take non-homogenous binning into account. I don't know that ROOT's kIsAverage does, exactly.

In my understanding a ROOT histogram with kIsAverage is interpreted as a density. When you TH1:Add two of them the bin contents are interpreted as counts normalized by the bin width (I'm not 100% percent tho, I should dig into the code).

Looks like uproot3 is not so clever and does not save this info in the TH1 python object... uproot4 does! I suggest that we leave these two things in the long-term todo list:

oschulz commented 3 years ago

https://github.com/JuliaRegistries/General/pull/38141

oschulz commented 3 years ago

I suggest that we leave these two things in the long-term todo list

Yep - can you open an issue for that, so we don't forget?