scikit-hep / awkward-0.x

Manipulate arrays of complex data structures as easily as Numpy.
BSD 3-Clause "New" or "Revised" License

Bug in string comparison in StringArray #247

Open dntaylor opened 4 years ago

dntaylor commented 4 years ago

There is a bug in StringArray when comparing strings:

>>> import awkward
>>> a = awkward.fromiter(['a','b','c'])
>>> a == 'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dntaylor/miniconda3/envs/test/lib/python3.7/site-packages/numpy/lib/mixins.py", line 25, in func
    return ufunc(self, other)
  File "/Users/dntaylor/miniconda3/envs/test/lib/python3.7/site-packages/awkward/array/objects.py", line 331, in __array_ufunc__
    right = self.StringArray.fromstr(len(left), right)
  File "/Users/dntaylor/miniconda3/envs/test/lib/python3.7/site-packages/awkward/array/objects.py", line 383, in fromstr
    for i, x in string:
TypeError: cannot unpack non-iterable int object
>>> a
<StringArray ['a' 'b' 'c'] at 0x000102c01390>
>>> awkward.__version__
'0.12.21'

I guess the loop in fromstr needs either enumerate(string) or to drop the "i" (since it isn't used).
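To illustrate the failure mode outside of awkward, here is a minimal sketch. It assumes the value reaching the loop is a bytes object (for example an encoded str), which is only a guess based on the "int" in the error message; the function names below are hypothetical and are not part of awkward.

```python
# Assumption: the loop receives bytes, so iterating over it yields plain ints.
data = "a".encode("utf-8")

def unpack_without_enumerate(data):
    # Same pattern as the failing line in the traceback: each item is an int,
    # which cannot be unpacked into (i, x).
    return [(i, x) for i, x in data]

def unpack_with_enumerate(data):
    # enumerate supplies the index, so each item is an (index, byte) pair.
    return [(i, x) for i, x in enumerate(data)]

try:
    unpack_without_enumerate(data)
    raised = False
except TypeError:
    raised = True

pairs = unpack_with_enumerate(data)
print(raised, pairs)
```

Whether the real fix should keep the index or drop it depends on how fromstr uses it, which this sketch does not try to reproduce.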

jpivarski commented 4 years ago

Thanks for pointing out this bug!

It does look easy enough to fix, but at the same time, it would be a good idea to consider moving to Awkward 1 (https://github.com/scikit-hep/awkward-1.0). The string implementation is considerably improved and you can use both in the same script because the package names are different. (In particular, I remember explicitly dealing with the case of broadcasting a string as a scalar to all elements in an array of strings—I doubt the old one would do it right, even with this fix.)

I'll be advertising the transition more after Uproot supports the new one, but it's usable now with ak.from_awkward0 and ak.to_awkward0.