larray-project / larray

N-dimensional labelled arrays in Python
https://larray.readthedocs.io/
GNU General Public License v3.0

String syntax does not work for single integer groups #1080

Open gdementen opened 9 months ago

gdementen commented 9 months ago

For text labels, it works nicely:

>>> arr = ndtest(3)
>>> arr
a  a0  a1  a2
    0   1   2
>>> arr.sum('a1;a0,a2')
a  a1  a0,a2
    1      2

When the labels are integers, though, the string syntax falls short:

>>> arr = ndtest("a=0,1,2")
>>> arr
a  0  1  2
   0  1  2
>>> arr.sum('1;0,2')
ValueError: '1' is not a valid label for any axis:
 a [3]: 0 1 2

This is almost not a bug, since it all comes down to my refusing to make the following work (with int-like labels, doing so would prevent targeting a single label):

>>> arr['1']
ValueError: '1' is not a valid label for any axis:
 a [3]: 0 1 2
>>> arr.sum('1')
ValueError: '1' is not a valid label for any axis:
 a [3]: 0 1 2

I am still not ready to lift this restriction (even though int-like labels are a bad idea and are not well supported by the string syntax anyway), at least until we have some way to express them (probably related to #34).

However, I think it would be reasonable to support single elements within "aggregation" strings, which we already interpret (by splitting on ;). The case above fails only because we first split on ; and only then decide, for each part separately, whether it must be interpreted.
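To make the ordering problem concrete, here is a minimal pure-Python sketch. It is not larray's actual code: `to_label`, `parse_single`, `parse_groups_current` and `parse_groups_proposed` are hypothetical names, and the parsing rules are simplified assumptions that only mimic the behaviour shown in this issue (a comma-free string is kept as a string label, while a string with a comma becomes a sequence whose int-like parts are converted).

```python
def to_label(token):
    """Convert an int-like token to int, otherwise keep it as a string (toy rule)."""
    token = token.strip()
    return int(token) if token.isdigit() else token


def parse_single(text, labels):
    """Interpret a standalone key, mimicking the restriction described above."""
    if ',' in text:
        # A comma marks a sequence: each non-empty part is converted.
        seq = [to_label(t) for t in text.split(',') if t.strip()]
        for label in seq:
            if label not in labels:
                raise ValueError(f"{label!r} is not a valid label")
        return seq
    # No comma: the key stays a string, so '1' never matches the int label 1.
    key = text.strip()
    if key not in labels:
        raise ValueError(f"{key!r} is not a valid label")
    return key


def parse_groups_current(text, labels):
    """Assumed current order: split on ';' first, then treat each part as standalone."""
    return [parse_single(part, labels) for part in text.split(';')]


def parse_groups_proposed(text, labels):
    """Hypothetical fix: once a ';' is present, every part is a group of converted labels."""
    parts = text.split(';')
    if len(parts) == 1:
        # Not an aggregation string: keep the standalone behaviour unchanged.
        return [parse_single(text, labels)]
    return [[to_label(t) for t in part.split(',') if t.strip()] for part in parts]


int_labels = [0, 1, 2]
print(parse_groups_current('1,;0,2', int_labels))   # [[1], [0, 2]] (the workaround)
print(parse_groups_proposed('1;0,2', int_labels))   # [[1], [0, 2]]
# parse_groups_current('1;0,2', int_labels) raises ValueError, as in the report
```

In this sketch the standalone `'1'` case still raises, so the restriction discussed above is untouched; only strings that already contain `;` change meaning, which is why such a change would still be backward-incompatible.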

FWIW, the current workaround is either to avoid the string syntax or to use a sequence of one element instead:

>>> arr['1,']
a  1
   1
>>> arr.sum('1,;0,2')
a  [1],  0,2
      1    2

but this is not obvious at all.

BTW: the default label is ugly in this case. I think it should be '1,' instead of '[1],', but this should be a separate issue.

gdementen commented 8 months ago

This did not make it into 0.34.2, as I am in a hurry to release it. Besides, it is a backward-incompatible fix (some users could rely on the current behaviour), so doing it in a patch release would be kinda bad.