scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

XRootDResource relies on querying parameters which is not supported by all storage elements #57

Closed nikoladze closed 4 years ago

nikoladze commented 4 years ago

get_server_config queries the server for some parameters used for vector reading:

https://github.com/scikit-hep/uproot4/blob/73d103dd5588bdf478937e475007fabd1a5803ec/uproot4/source/xrootd.py#L32-L41

Not all storage elements seem to report this correctly (apparently those that use dCache); they just report back the name of the parameter, e.g.

>>> import XRootD.client
>>> fs = XRootD.client.FileSystem("root://prometheus.desy.de:1094/")
>>> fs.query(XRootD.client.flags.QueryCode.CONFIG, "readv_iov_max")
(<status: 0, code: 0, errno: 0, message: '[SUCCESS] ', shellcode: 0, error: False, fatal: False, ok: True>, b'readv_iov_max\n')

Unfortunately I can't give a minimal reproducer that opens a ROOT file, since I couldn't find publicly accessible ROOT files on any of these storages, but essentially it breaks when trying to convert this value to an int (the following works with access to the ATLAS VO):

>>> import uproot4
>>> f = uproot4.open("root://lcg-lrz-rootd.grid.lrz.de:1094/pnfs/lrz-muenchen.de/data/atlas/dq2/atlaslocalgroupdisk/rucio/data15_13TeV/f4/ba/DAOD_PHYSLITE.21568620._000001.pool.root.1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nikolai/python/uproot4/uproot4/reading.py", line 80, in open
    file = ReadOnlyFile(
  File "/home/nikolai/python/uproot4/uproot4/reading.py", line 139, in __init__
    self._source = Source(file_path, **self._options)
  File "/home/nikolai/python/uproot4/uproot4/source/xrootd.py", line 182, in __init__
    self._max_num_elements, self._max_element_size = get_server_config(file_path)
  File "/home/nikolai/python/uproot4/uproot4/source/xrootd.py", line 37, in get_server_config
    readv_iov_max = int(readv_iov_max)
ValueError: invalid literal for int() with base 10: b'readv_iov_max\n'

So some kind of fallback (default value that is configurable if needed?) would be needed to support these storages for now.
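
A minimal sketch of what such a fallback could look like, assuming the raw query result is passed in as bytes. The helper name parse_config_int and the default value are hypothetical placeholders, not taken from uproot4 or from any server documentation:

```python
# Placeholder default for illustration only; a real value would have to be
# chosen (or made configurable) by the library.
ASSUMED_DEFAULT_READV_IOV_MAX = 1024


def parse_config_int(raw, default):
    """Return the queried value as an int, or `default` if it is not numeric.

    Servers that do not support the query may echo back the parameter
    name (e.g. b'readv_iov_max\n') instead of a number.
    """
    if isinstance(raw, bytes):
        raw = raw.decode(errors="replace")
    try:
        return int(raw.strip())
    except (AttributeError, ValueError):
        # AttributeError covers raw being None; ValueError covers
        # non-numeric text such as the echoed parameter name.
        return default


# The echoed parameter name falls back to the default...
echoed = parse_config_int(b"readv_iov_max\n", ASSUMED_DEFAULT_READV_IOV_MAX)
# ...while a well-formed numeric reply is used as-is.
numeric = parse_config_int(b"2048\n", ASSUMED_DEFAULT_READV_IOV_MAX)
```

This only avoids the ValueError; it does not by itself guarantee that the fallback value is one the server actually accepts.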

chrisburr commented 4 years ago

It turns out this is caused by querying a redirector instead of the data server itself. I've found a similar setup within LHCb which I can use for testing a solution.