Closed marinang closed 5 years ago
Some users have been using TH2s; have they been looking at diagonally flipped plots? We have to somehow warn them before changing their interface.
@nsmith- made some edits and @alasfar-lina raised some issues about TH2 in the past.
This pull request, in addition to adding TH3 (thanks!) swaps the x and y axes for TH2, presumably because they were wrong before. Unit tests don't go all the way to plotting, so that's entirely possible. Do you two find this to be the case, and would you be okay with a breaking API that fixes it?
Interesting, we did notice some oddities in the ordering, but I think ROOT is to blame for it being backwards. I will confirm today
Creating a 2D hist:
import ROOT
f = ROOT.TFile("test.root", "recreate")
h = ROOT.TH2D("hist", "test;x axis;y axis", 3, 0, 3, 4, 0, 4)
h.Fill(1,2)
h.Write()
and reading with uproot:
import uproot
f = uproot.open("test.root")
h = f['hist']
print("Edges:", h.edges)
print("Values:", h.values)
print("Values at (y,x) = (2,1):", h.values[2,1])
print("X title:", h._fXaxis._fTitle)
prints:
Edges: (array([0., 1., 2., 3.]), array([0., 1., 2., 3., 4.]))
Values: [[0. 0. 0.]
[0. 0. 0.]
[0. 1. 0.]
[0. 0. 0.]]
Values at (y,x) = (2,1): 1.0
X title: b'x axis'
So the h.values
array is the correct shape for (y,x)
ordering. The array h.values.reshape(3,4)
is invalid.
Hence there is an inconsistency between the tuple h.edges
and the 2D array h.values
. Personally I'm in favor of h.values.T
rather than yedges, xedges = h.edges
, but either way a breaking change should be made.
That's odd. What about TH3 then is it (z, y, x) ? I will have a look.
I think so, recalling how tree->Draw("z:y:x >> hist")
works... ROOT does strange things sometimes :)
I have just checked the result of you example, and the bin with 1.0 is now an y overflow bin ...
>>> np.array(h[:]).reshape(5, 6)
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
>>> h.yoverflows
array([0., 0., 1., 0., 0.])
I will make some adjustments to get the correct ordering from ROOT and provide a (x, y, (z)) ordering in values and edges (ect...). Actually your example could be added as a test.
I put [WIP]
in the name to prevent this from being accidentally merged. When you're certain that you've put things in the right order, remove the [WIP]
and I'll accept whatever conclusions you've come to.
About (contents, xbins, ybins)
vs (contents, [xbins, ybins])
, I implemented one because it matches what numpy.histogram2d
returns. However, the other is what numpy.histogramdd
returns (with a nested list, not type, I believe). If Numpy is deprecating or at least downplaying numpy.histogram2d
—I get that impression from some docs, but I'm not sure—then perhaps we should follow the d-dimensional signature.
I just hate switching again it on unsuspecting users. It was already the cause of one big report, and who knows how many people it affected without writing bug reports? If we're referring the order of the axes, that's just as bad and we should take this opportunity to put it in its final state.
Ok as for now the order of the axes is (x, y, (z)) for TH2 and TH3.
You are talking about the numpy
method right ? I think the output formats should be identical for both TH2 and TH3 and I personally prefer this (contents, [xbins, ybins, zbins])
one.
Consistency between TH2 and TH3 is a good argument; it could be why Numpy is quietly hiding numpy.histogram2d
(not listed in "see also" from numpy.histogram
; only the d-dimensional case is listed).
@alasfar-lina, are you okay with this change? It would essentially revert what you saw in #36. If I could reach everybody I thought was using this, I would try, but the issues are all I have to go with.
(An announcement on the front page probably wouldn't do any good—it would affect people's workflows who have already downloaded uproot and set up their scripts. They're not going to be looking there.)
Is this done? Ready to be merged?
No I was waiting for an answer from @alasfar-lina. I can apply the changes now and tell when it's ready.
@jpivarski done!
I'm glad I see these PRs now. Need to update https://github.com/CoffeaTeam/fnal-column-analysis-tools.
It just went into PyPI as uproot-methods 0.4.3 a minute ago.
Creation of TH3.py as discussed in #8. Modification of TH2.py, values was shaped as (ynumbins + 1 , xnumbins + 1) whereas it should be (xnumbins + 1 , ynumbins + 1) (right ?).