scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

ROOT Map() showing errors for any written TTrees with TBaskets #931

Open ioanaif opened 1 year ago

ioanaif commented 1 year ago

A regression was introduced between uproot 4.1.1 and 4.1.5: ROOT's TFile::Map() reports errors for any written TTree that has TBaskets, whether or not the TBaskets are empty.

>>> import uproot
>>> import ROOT
>>> import numpy as np
>>> uproot.__version__
'5.0.10'
>>> file = uproot.recreate("simple_reproducer.root")
>>> file["tree"] = {"some_array": np.array([1,2,3])}
>>> rf = ROOT.TFile("simple_reproducer.root")
>>> rf.Map()
20230810/162540  At:100     N=142       TFile                     
20230810/162557  At:242     N=101       TBasket                   
Address = 343   Nbytes = 0  =====E R R O R=======
0/000000  At:343     N=1         END           
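
To make the failure easier to spot when comparing uproot versions, the Map() listing above could be scanned programmatically. This is only a minimal sketch: it assumes the text printed by Map() has already been captured into a string by some means (ROOT prints it from C++, so capturing it is not shown here), and the names SAMPLE, scan_map_output and find_gaps are made up for illustration. The line patterns are inferred solely from the output shown in this issue.

```python
import re

# Record lines look like: "20230810/162540  At:100     N=142       TFile"
RECORD = re.compile(r"^\s*\d+/\d+\s+At:(\d+)\s+N=(\d+)\s+(\S+)")
# Error lines look like: "Address = 343   Nbytes = 0  =====E R R O R======="
ERROR = re.compile(r"^Address\s*=\s*(\d+)\s+Nbytes\s*=\s*(-?\d+)\s+=+E R R O R=+")

# Output copied from the reproducer in this issue.
SAMPLE = """\
20230810/162540  At:100     N=142       TFile
20230810/162557  At:242     N=101       TBasket
Address = 343   Nbytes = 0  =====E R R O R=======
0/000000  At:343     N=1         END
"""


def scan_map_output(text):
    """Return (records, errors) parsed from captured Map() text.

    records: list of (address, nbytes, class_name)
    errors:  list of (address, nbytes) for lines flagged as errors
    """
    records, errors = [], []
    for line in text.splitlines():
        match = RECORD.match(line)
        if match:
            records.append((int(match.group(1)), int(match.group(2)), match.group(3)))
            continue
        match = ERROR.match(line)
        if match:
            errors.append((int(match.group(1)), int(match.group(2))))
    return records, errors


def find_gaps(records):
    """Return pairs of consecutive records that are not contiguous on disk."""
    return [
        (prev, nxt)
        for prev, nxt in zip(records, records[1:])
        if prev[0] + prev[1] != nxt[0]
    ]
```

For the listing above, scan_map_output(SAMPLE) yields three records and one error at address 343, and find_gaps reports nothing: the TFile and TBasket records are contiguous, and the error sits exactly where the TBasket record ends.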

ioanaif commented 7 months ago

I looked into this error and tried to fix it. I did not find a fix, but I did isolate where the issue comes from, so I am recording my findings here.

The problem is in writable.py:

https://github.com/scikit-hep/uproot5/blob/c128e4bebdd5aad2bfaa5bc5f4ce896fe7bd9e72/src/uproot/writing/writable.py#L1323-L1334

If the list of models contains only the entry on L1333, i.e. uproot.models.TTree.Model_TTree_v20, then the Map() errors disappear.

So the issue comes from some of the streamers in TLeaf and TBranch. For each TLeaf type, the errors go away if we comment out this part:

https://github.com/scikit-hep/uproot5/blob/c128e4bebdd5aad2bfaa5bc5f4ce896fe7bd9e72/src/uproot/models/TLeaf.py#L185-L192

Here, _rawstreamer_TLeaf_v2 is defined at:

https://github.com/scikit-hep/uproot5/blob/c128e4bebdd5aad2bfaa5bc5f4ce896fe7bd9e72/src/uproot/models/TLeaf.py#L14-L19

As for TBranch, the errors go away when lines 562 to 568 are commented out:

https://github.com/scikit-hep/uproot5/blob/c128e4bebdd5aad2bfaa5bc5f4ce896fe7bd9e72/src/uproot/models/TBranch.py#L559-L569
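
The isolation approach described above (start from a model list known to be clean, then see which additions bring the errors back) could be automated along these lines. This is a hedged sketch, not uproot code: find_offenders and the is_clean callback are hypothetical. In practice is_clean would have to write a file using only the given models' streamers and check the Map() output, which is not implemented here.

```python
def find_offenders(baseline, candidates, is_clean):
    """Return the candidates that break a known-good baseline when added alone.

    baseline:   list of models for which is_clean(...) is expected to pass
    candidates: models to test one at a time on top of the baseline
    is_clean:   hypothetical callback; returns True if a file written with
                the given models shows no Map() errors
    """
    baseline = list(baseline)
    if not is_clean(baseline):
        raise ValueError("baseline is not clean; nothing can be isolated")
    return [model for model in candidates if not is_clean(baseline + [model])]


if __name__ == "__main__":
    # Stand-in predicate for demonstration only: it pretends that any list
    # containing "TLeaf" or "TBranch" produces Map() errors.
    def fake_is_clean(models):
        return not ({"TLeaf", "TBranch"} & set(models))

    print(find_offenders(["TTree"], ["TLeaf", "TBranch", "TObject"], fake_is_clean))
```

Adding candidates one at a time, rather than removing them one at a time, matters here because more than one model triggers the errors: removing a single offender would still leave the file failing.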