scikit-hep / uproot3-methods

Pythonic behaviors for non-I/O related ROOT classes.
BSD 3-Clause "New" or "Revised" License

Multidimensional TLorentzVectorArray doesn't work #68

Closed beojan closed 4 years ago

beojan commented 4 years ago

I can create them:

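import numpy as np
from uproot_methods import TLorentzVectorArray

# Four Cartesian components, each a regular 2-D array of shape (5, 2)
x = np.ones((5,2))
y = np.ones((5,2))
z = np.ones((5,2))
t = 5*np.ones((5,2))
arr = TLorentzVectorArray(x,y,z,t)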

And print their mass:

arr.mass
[[4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]]

But I can't print the array itself:

print(arr)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-59-e3eb363b4781> in <module>
----> 1 print(arr)

/usr/lib/python3.7/site-packages/awkward/array/base.py in __str__(self)
     96     def __str__(self):
     97         if len(self) <= 6:
---> 98             return "[{0}]".format(" ".join(self._util_arraystr(x) for x in self.__iter__(checkiter=False)))
     99 
    100         else:

/usr/lib/python3.7/site-packages/awkward/array/base.py in <genexpr>(.0)
     96     def __str__(self):
     97         if len(self) <= 6:
---> 98             return "[{0}]".format(" ".join(self._util_arraystr(x) for x in self.__iter__(checkiter=False)))
     99 
    100         else:

/usr/lib/python3.7/site-packages/awkward/array/objects.py in __iter__(self, checkiter)
    176             self._checkiter()
    177         for x in self._content:
--> 178             yield self.generator(x, *self._args, **self._kwargs)
    179 
    180     def __getitem__(self, where):

/usr/lib/python3.7/site-packages/uproot_methods/classes/TLorentzVector.py in <lambda>(row)
    124 class ArrayMethods(Common, uproot_methods.base.ROOTMethods):
    125     def _initObjectArray(self, table):
--> 126         self.awkward.ObjectArray.__init__(self, table, lambda row: TLorentzVector(row["fX"], row["fY"], row["fZ"], row["fE"]))
    127 
    128     def __awkward_serialize__(self, serializer):

/usr/lib/python3.7/site-packages/uproot_methods/classes/TLorentzVector.py in __init__(self, x, y, z, t)
    897 class TLorentzVector(Methods):
    898     def __init__(self, x, y, z, t):
--> 899         self._fP = uproot_methods.classes.TVector3.TVector3(float(x), float(y), float(z))
    900         self._fE = float(t)
    901 

TypeError: only size-1 arrays can be converted to Python scalars

Nor can I add along only the inner axis, since sum takes no axis argument.
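
Since four-vector addition is component-wise, an inner-axis sum can be done on the Cartesian components with plain NumPy, before any TLorentzVectorArray is built. The following is a minimal sketch of that idea, not a feature of uproot-methods; the names px_sum, py_sum, pz_sum, e_sum and mass_sum are illustrative.

```python
import numpy as np

# Same components as in the report: shape (5, 2)
x = np.ones((5, 2))
y = np.ones((5, 2))
z = np.ones((5, 2))
t = 5 * np.ones((5, 2))

# Four-vector addition is component-wise, so sum each component
# along the inner axis (axis=1) separately.
px_sum = x.sum(axis=1)
py_sum = y.sum(axis=1)
pz_sum = z.sum(axis=1)
e_sum = t.sum(axis=1)

# Invariant mass of each summed vector: sqrt(E^2 - |p|^2)
mass_sum = np.sqrt(e_sum**2 - px_sum**2 - py_sum**2 - pz_sum**2)
print(mass_sum)
```

The summed components are one-dimensional, so they could then be passed to TLorentzVectorArray without hitting the two-dimensional case described here.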

jpivarski commented 4 years ago

Yeah, I see: it assumes that an individual element is one element of the underlying NumPy array (i.e. what you get from array[n] for some integer n). But since the NumPy array is two-dimensional, array[n] is itself an array, and that can't be passed into float (hence the error message).
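
The failure described here can be reproduced with NumPy alone, without awkward or uproot-methods. This is a small sketch; the variable names row, converted and scalar are illustrative.

```python
import numpy as np

x = np.ones((5, 2))

# Indexing a 2-D array with one integer returns a 1-D array, not a scalar.
row = x[0]
print(row.shape)  # (2,)

# float() only accepts something that holds a single value,
# so a length-2 array is rejected with a TypeError.
try:
    float(row)
    converted = True
except TypeError as err:
    converted = False
    print("TypeError:", err)

# Indexing down to a single element does give a convertible scalar.
scalar = float(x[0, 0])
print(scalar)
```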

In the Awkward 1.0 project, I'm erasing the difference between Awkward arrays and Numpy arrays at a structural level, to fix all of this kind of bug once and for all.

For now, though, can you work with jagged arrays of TLorentzVector? Even though you know that you'll always have 2 Lorentz vectors in each element, if you technically make it jagged, it will work. (It's a case that many people have tested.) Instead of creating x, y, z, t directly with np.ones, pass the np.ones through awkward.fromiter (slow) or build the jagged array manually using JaggedArray.fromcounts(np.full(N, 2, dtype=int), content) (fast).
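
To see why the jagged layout avoids the problem, the counts-plus-content structure can be imitated in plain NumPy. This sketch only mimics the layout and is not the awkward API: the content is one-dimensional, so each element is a true scalar, and the counts say how to regroup it. The names offsets, mass_flat and mass_per_event are illustrative.

```python
import numpy as np

N = 5
x = np.ones((N, 2))
y = np.ones((N, 2))
z = np.ones((N, 2))
t = 5 * np.ones((N, 2))

# Every "event" holds exactly 2 vectors, but is stored as if it were jagged.
counts = np.full(N, 2, dtype=int)
offsets = np.concatenate(([0], np.cumsum(counts)))

# Flat 1-D content: one entry per Lorentz vector.
cx, cy, cz, ct = (a.reshape(-1) for a in (x, y, z, t))

# Each element of the flat content is a scalar, so float() succeeds.
first_x = float(cx[0])

# Compute on the flat content, then regroup using the offsets.
mass_flat = np.sqrt(ct**2 - cx**2 - cy**2 - cz**2)
mass_per_event = [mass_flat[offsets[i]:offsets[i + 1]] for i in range(N)]
print(mass_per_event[0])
```

With awkward 0.x installed, the flat content and counts would instead go to JaggedArray.fromcounts, as described in the comment above.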

beojan commented 4 years ago

That helped. I used JaggedArray.fromregular.

Unfortunately, JaggedArray can't slice in more than two dimensions (I actually have four), but I can do what I'm doing now since I can flatten out most of these dimensions.
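
The comment does not say how the dimensions were flattened. One plausible approach, assuming a regular four-dimensional array, is to collapse the leading axes with reshape so that only two levels remain, then restore the shape afterwards. The shape (3, 4, 5, 2) and all names below are hypothetical.

```python
import numpy as np

# Hypothetical 4-D component array; the last axis is the one kept as "inner".
shape = (3, 4, 5, 2)
x = np.arange(np.prod(shape), dtype=float).reshape(shape)

# Collapse the three leading axes into one, leaving a 2-D array.
flat = x.reshape(-1, shape[-1])
print(flat.shape)  # (60, 2)

# A multi-index over the leading axes maps to a single row index.
row_index = np.ravel_multi_index((1, 2, 3), shape[:-1])
same_row = np.array_equal(flat[row_index], x[1, 2, 3])

# After the two-level processing, the original shape can be restored.
restored = flat.reshape(shape)
```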

jpivarski commented 4 years ago

Oh, the irony:

[image]

I just finished implementing arbitrarily deep slicing in the new Awkward and consider it one of the motivators for this project. In pure NumPy, each level of slicing depth adds complexity, particularly because of NumPy's weird rules for advanced indexing. Without having to express everything in NumPy calls, it can be a recursive procedure that's completely self-similar and still vectorizable.
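
The self-similar recursion mentioned here can be illustrated with a toy function on nested Python lists. This is only a sketch of the idea, not Awkward's implementation, and it is not vectorized; slice_nested is a hypothetical helper that accepts a tuple of integers and slices.

```python
def slice_nested(data, where):
    """Apply a tuple of int/slice items to nested lists, one level per item."""
    if not where:
        return data
    head, tail = where[0], where[1:]
    if isinstance(head, int):
        # An integer selects one item and removes this level of nesting.
        return slice_nested(data[head], tail)
    # A slice keeps this level; the same procedure is applied to each item.
    return [slice_nested(item, tail) for item in data[head]]


data = [[[1, 2], [3]], [[4, 5, 6]]]

# Keep every outer item, take its first sublist, drop that sublist's first entry.
result = slice_nested(data, (slice(None), 0, slice(1, None)))
print(result)  # [[2], [5, 6]]
```

Each level handles only its own item of the tuple and hands the rest to the next level, which is why the depth of the slice does not add special cases.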

I hadn't known that anyone using awkward-array was actually running into this two-level slicing constraint—I never heard any feedback about it. However, I was pretty uncomfortable with the artificiality of the limitation.