scikit-hep / uproot3

ROOT I/O in pure Python and NumPy.
BSD 3-Clause "New" or "Revised" License

Error reading back jagged int64 branches with uproot with default compression #506

Open masonproffitt opened 4 years ago

masonproffitt commented 4 years ago

I'm getting odd errors when trying to read back (in uproot) jagged branches with 64-bit integers written by uproot:

import uproot, awkward
with uproot.recreate('test.root') as file:
    file['tree'] = uproot.newtree({'branch': uproot.newbranch('int64', size='n')})
    file['tree'].extend({'branch': awkward.fromiter([[1, 1, 1], [1]]), 'n': [3, 1]})
uproot.open('test.root')['tree']['branch'].array()
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    uproot.open('test.root')['tree']['branch'].array()
  File "/usr/lib/python3.8/site-packages/uproot/tree.py", line 1434, in array
    _delayedraise(fill(j))
  File "/usr/lib/python3.8/site-packages/uproot/tree.py", line 59, in _delayedraise
    raise err.with_traceback(trc)
  File "/usr/lib/python3.8/site-packages/uproot/tree.py", line 1402, in fill
    source = self._basket(i, interpretation, local_entrystart, local_entrystop, awkward, basketcache, keycache)
  File "/usr/lib/python3.8/site-packages/uproot/tree.py", line 1185, in _basket
    basketdata = key.basketdata()
  File "/usr/lib/python3.8/site-packages/uproot/tree.py", line 1692, in basketdata
    return self.cursor.copied().bytes(datasource, self._fObjlen)
  File "/usr/lib/python3.8/site-packages/uproot/source/cursor.py", line 54, in bytes
    return source.data(start, stop)
  File "/usr/lib/python3.8/site-packages/uproot/source/compressed.py", line 186, in data
    self._prepare()
  File "/usr/lib/python3.8/site-packages/uproot/source/compressed.py", line 157, in _prepare
    raise ValueError("unrecognized compression algorithm: {0}".format(algo))
ValueError: unrecognized compression algorithm: b'\x00\x00'
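
To make the final error easier to interpret, here is a minimal stdlib-only sketch of the kind of check that fails in `_prepare`. It assumes each compressed block starts with a 9-byte header: a 2-byte algorithm tag, one method byte, then 3-byte little-endian compressed and uncompressed sizes. The names `HEADER`, `KNOWN_TAGS`, and `parse_block_header` are hypothetical and are not uproot's actual code; the tag table is an assumption as well.

```python
import struct

# Assumed header layout: 2-byte algorithm tag, 1 method byte,
# 3 bytes of compressed size, 3 bytes of uncompressed size (little-endian).
HEADER = struct.Struct("2sBBBBBBB")

# Assumed mapping from header tag to algorithm name (illustrative only).
KNOWN_TAGS = {b"ZL": "zlib", b"XZ": "lzma", b"L4": "lz4", b"ZS": "zstd"}


def parse_block_header(raw):
    """Return (algorithm, compressed_size, uncompressed_size) for one block."""
    algo, method, c1, c2, c3, u1, u2, u3 = HEADER.unpack(raw[:HEADER.size])
    if algo not in KNOWN_TAGS:
        # Same message shape as the ValueError in the traceback above.
        raise ValueError("unrecognized compression algorithm: {0}".format(algo))
    compressed = c1 + (c2 << 8) + (c3 << 16)
    uncompressed = u1 + (u2 << 8) + (u3 << 16)
    return KNOWN_TAGS[algo], compressed, uncompressed


# A well-formed header (method byte and sizes chosen arbitrarily) parses cleanly.
good = b"ZL\x08" + (42).to_bytes(3, "little") + (48).to_bytes(3, "little")
print(parse_block_header(good))

# Nine zero bytes where a header is expected reproduce the error.
try:
    parse_block_header(bytes(9))
except ValueError as err:
    print(err)
```

Under these assumptions, a tag of `b'\x00\x00'` means the reader found zero bytes at the position where it expected a block header, which suggests the bytes on disk do not match the sizes recorded for the basket.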

ROOT doesn't like the file either:

root [0] TFile f("test.root")
(TFile &) Name: test.root Title: 
root [1] tree->GetEntry(0)
Error R__unzip_header: error in header.  Values: 00
Error in <TBasket::ReadBasketBuffers>: Inconsistency found in header (nin=0, nbuf=0)
Error in <TBasket::ReadBasketBuffers>: fNbytes = 115, fKeylen = 73, fObjlen = 48, noutot = 32, nout=32, nin=0, nbuf=0
Error in <TBranch::GetBasket>: File: test.root at byte:13107, branch:branch, entry:0, badread=1, nerrors=1, basketnumber=0
(int) -1
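
The sizes in the ROOT message can be checked by hand. The sketch below only redoes that arithmetic. That a reader decides whether a basket is compressed by comparing `fNbytes - fKeylen` with `fObjlen` is my assumption about the readers, not something the log itself states.

```python
# Values copied from the TBasket::ReadBasketBuffers message above.
fNbytes, fKeylen, fObjlen = 115, 73, 48

# Bytes stored on disk after the key.
payload_on_disk = fNbytes - fKeylen

# Assumption: a reader takes the decompression path when the on-disk
# payload size differs from the declared uncompressed size (fObjlen).
looks_compressed = payload_on_disk != fObjlen

print(payload_on_disk, looks_compressed)  # prints: 42 True
```

With 42 bytes on disk against a declared 48, a reader following this rule would try to decompress the basket, which is consistent with both error messages complaining about a bad compression header.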

Interestingly, it seems that the file can be read back (in both uproot and ROOT) if you set compression=None when writing it, which is why the test at https://github.com/scikit-hep/uproot/blob/634667fad826ec6c86e2df442887b1024c2cfee8/tests/test_write.py#L1931-L1946 doesn't fail.