larray-project / larray

N-dimensional labelled arrays in Python
https://larray.readthedocs.io/
GNU General Public License v3.0

add an option in Array.set_labels to ignore missing labels #1090

Open gdementen opened 8 months ago

gdementen commented 8 months ago

When applying the same mapping to several arrays (for example, to translate labels for final reporting), it is very tedious to compute the correct mapping for each array by hand, i.e. to intersect the global mapping keys with that array's axis labels.

>>> from larray import ndtest
>>> arr = ndtest(3)
>>> arr.set_labels({'a1': 'A1', 'a3': 'A3'})
ValueError: 'a3' is not a valid label for any axis:
 a [3]: 'a0' 'a1' 'a2'

Since this could hide errors and it sort-of-conflicts with #906, we cannot do this by default. Having an option would be nice though.
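
To make the requested behaviour concrete, here is a minimal sketch of the two modes on a plain list of labels, without using LArray at all. The function name `relabel` and the option name `ignore_missing` are hypothetical and only serve the illustration; nothing here is an existing LArray parameter.

```python
def relabel(labels, labelmap, ignore_missing=False):
    """Return `labels` with `labelmap` applied.

    Plain-Python stand-in for a single axis. `ignore_missing` is a
    hypothetical option name, not an existing LArray parameter.
    """
    known = set(labels)
    # keys of the mapping which do not match any label of this axis
    missing = [key for key in labelmap if key not in known]
    if missing and not ignore_missing:
        # strict mode: mimic the current behaviour and fail loudly
        raise ValueError(f"{missing[0]!r} is not a valid label")
    # lenient mode: labels without an entry in the mapping are kept as-is
    return [labelmap.get(label, label) for label in labels]


labels = ['a0', 'a1', 'a2']
mapping = {'a1': 'A1', 'a3': 'A3'}

print(relabel(labels, mapping, ignore_missing=True))
# ['a0', 'A1', 'a2']
```

With the default `ignore_missing=False`, the same call raises a `ValueError` for 'a3', so a typo in the mapping is still caught unless the caller explicitly opts out.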

gdementen commented 8 months ago

FWIW, I needed this for dc2019

gdementen commented 1 month ago

I needed this for the promes project too.

Here is some preliminary code to do so:

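import larray as la

def set_labels(arr: la.Array, labelmaps: dict) -> la.Array:
    # labelmaps: {axis id: {old label: new label}}; unknown axes and labels are ignored
    def new_labels(axis: la.Axis, labelmap: dict) -> list:
        # keep any label which has no entry in the mapping
        return [labelmap.get(label, label) for label in axis.labels]

    # only relabel the axes of arr which have a mapping
    return arr.set_labels({axis: new_labels(axis, labelmaps[axis.id])
                           for axis in arr.axes if axis.id in labelmaps})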

In dc2024 (and possibly dc2019), I used different code so that I could rename the axis at the same time, using a special __name__ label, but I am unsure it is a good idea to support this in LArray itself.

Test code:

>>> arr = ndtest((2, 3))
>>> set_labels(arr, {'b': {'b1': 'B1', 'b3': 'B3'}, 'c': {'c1': 'C1'}})
a\b  b0  B1  b2
 a0   0   1   2
 a1   3   4   5
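
For readers without LArray at hand, the result above can be traced with a plain-Python stand-in, where each axis is modelled as an entry in a dict of label lists. This is only an illustration of the lookup logic; `relabel_axes` is a hypothetical helper, not part of LArray.

```python
def relabel_axes(axes, labelmaps):
    """Apply per-axis label mappings, ignoring unknown axes and labels.

    axes: dict of axis name -> list of labels (stand-in for an array's axes)
    labelmaps: dict of axis name -> {old label: new label}
    """
    return {name: [labelmaps.get(name, {}).get(label, label) for label in labels]
            for name, labels in axes.items()}


# same axes as ndtest((2, 3)): a with 2 labels, b with 3 labels
axes = {'a': ['a0', 'a1'], 'b': ['b0', 'b1', 'b2']}
labelmaps = {'b': {'b1': 'B1', 'b3': 'B3'}, 'c': {'c1': 'C1'}}

print(relabel_axes(axes, labelmaps))
# {'a': ['a0', 'a1'], 'b': ['b0', 'B1', 'b2']}
```

The mapping for 'b3' and the whole mapping for axis 'c' are silently skipped, which is exactly the behaviour that would hide typos if it were the default.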