scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Writing TTrees to a file from a numpy array containing strings #1265

Closed Blue-EyesChaosMaxDragon closed 3 months ago

Blue-EyesChaosMaxDragon commented 3 months ago

Tested with uproot version 5.3.10. When writing NumPy arrays to a TTree, the code crashes if one of the arrays contains strings (using the uproot.AsStrings interpretation). Here is a minimal example that reproduces the error:

import uproot
import numpy as np

file = uproot.recreate("example.root")
file["DecayTree"] = {"x": np.array(["A","B"]), "y": np.array([1,2])}
file["DecayTree"].extend({"x": np.array(["A","B"]), "y": np.array([1,2])})

This code snippet produces the following error:

File ~/.local/lib/python3.11/site-packages/uproot/writing/_cascadetree.py:666, in Tree.extend(self, file, sink, data)
    664 if datum["counter"] is None:
    665     if datum["dtype"] == ">U0":
--> 666         lengths = numpy.asarray(awkward.num(branch_array.layout))
    667         which_big = lengths >= 255
    669         lengths_extension_offsets = numpy.empty(
    670             len(branch_array.layout) + 1, numpy.int64
    671         )

UnboundLocalError: cannot access local variable 'awkward' where it is not associated with a value
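
For context, Python raises this kind of error when a name is bound somewhere inside a function (an `import` statement inside the function counts as a binding) but the code path actually taken skips that binding. The sketch below illustrates the general pattern only; it is not uproot's code, and `count_strings`, `use_fast_path`, and the use of the standard-library `math` module are hypothetical stand-ins chosen for illustration.

```python
def count_strings(values, use_fast_path):
    # Because of the import below, `math` is a local name for the whole
    # function, but it is only bound when use_fast_path is true.
    if use_fast_path:
        import math
    # When the branch above is skipped, `math` is local but unbound,
    # so this line raises UnboundLocalError.
    return math.floor(len(values))


try:
    count_strings(["A", "B"], use_fast_path=False)
    error_name = None
except UnboundLocalError as err:
    error_name = type(err).__name__
    print(error_name, err)
```

Importing the module unconditionally (at the top of the function or of the file) avoids the problem in this sketch.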

Blue-EyesChaosMaxDragon commented 3 months ago

As a workaround, the array of strings can be written to the TTree by switching from NumPy to Awkward Array:

import uproot
import awkward as ak

file = uproot.recreate("example.root")
file["DecayTree"] = {"x": ak.Array(["A","B"]), "y": ak.Array([1,2])}
file["DecayTree"].extend({"x": ak.Array(["A","B"]), "y": ak.Array([1,2])})

jpivarski commented 3 months ago

Thanks for catching this! It should be fixed in #1266.