scikit-hep / awkward-0.x

Manipulate arrays of complex data structures as easily as Numpy.
BSD 3-Clause "New" or "Revised" License
215 stars 39 forks source link

Failed to encode persistentvirtual LazyFiles #220

Closed vargasa closed 4 years ago

vargasa commented 4 years ago

The following code works as expected:

import uproot
import awkward

f = uproot.open("xfiles.root")
t = f["Events"]

awk = t.lazyarrays(persistvirtual=True)
cut = awk[awk.nMuon >= 2]
awkward.save("muonCut.awkd", cut, mode="w")

However, if I use uproot.lazyarrays directly:

import uproot
import awkward

awk = uproot.lazyarrays("xfiles.root",
                      treepath="Events",
                      persistvirtual=True)

cut = awk[awk.nMuon >= 2]
print(cut)

awkward.save("muonCut.awkd", cut, mode="w")

The second piece yields to:

[<Row 0> <Row 2> <Row 6> ... <Row 1152947> <Row 1152952> <Row 1152953>]

which is expected. However when calling awkward.save I get:

TypeError: failed to encode <uproot.tree._LazyFiles object at 0x7fda9b511908> (type: <class 'uproot.tree._LazyFiles'>)

The second piece is intended to be used for chaining some files instead of using only one as in the code shown. Is the virtualness expected to work in this case?

jpivarski commented 4 years ago

This is a bug: it looks like _LazyTree has correctly implemented persistence and _LazyFiles doesn't. I'll take a look.

jpivarski commented 4 years ago

In the latest uproot, _LazyTree fails, and _LazyFiles fails differently! So, first off, update to the latest uproot because something changed. I'll dig into the error, though.

jpivarski commented 4 years ago

Actually, I don't see anything in here that would enable persistence of lazily loaded trees or sets of trees from uproot. As far as I can see, it just hasn't been implemented, and there are a few subtleties that we'd have to solve to do it right.

This also bothers me because I remember saving lazy data from uproot. I don't remember when, but I see no evidence that it has been implemented (Mandela Effect?). For the moment, however, I think you should consider this not an existing feature.