iris-hep / analysis-grand-challenge

Repository dedicated to AGC preparations & execution
https://agc.readthedocs.io
MIT License

NanoAOD file access over http and performance #128

Open alexander-held opened 1 year ago

alexander-held commented 1 year ago

ServiceX transforms of NanoAOD files and direct uproot-based access via http both seem to be slower than the corresponding access to ntuples; see the timing comparison in this gist: https://gist.github.com/alexander-held/4e58811522ed9990afb2d4b73ef9471e.

@masonproffitt pointed out a related XRootD issue: https://github.com/xrootd/xrootd/issues/1976. Requesting too much data in one go causes a 500 error, after which uproot falls back to individual requests, making everything slower. A similar issue is https://github.com/xrootd/xrootd/issues/2003: that one is about requesting too many ranges at once, while the former is about requesting too many bytes in a range.
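To make "too many ranges at once" and "too many bytes in a range" concrete, here is a minimal sketch of the multi-range HTTP Range header that such a read produces. This is plain Python/HTTP illustration, not uproot or XRootD internals; the basket offsets are made up, and the only concrete limit value quoted in this thread is the 1024-byte-range cap mentioned further below.

```python
# Illustration only (not uproot or XRootD internals): how a multi-range HTTP read
# is expressed, and the two quantities the linked XRootD issues are about.
baskets = [(i * 4096, i * 4096 + 4095) for i in range(2000)]  # hypothetical (start, stop) pairs

# One "Range" header can ask for many byte ranges at once:
range_header = "bytes=" + ",".join(f"{start}-{stop}" for start, stop in baskets)

n_ranges = len(baskets)                                         # "too many ranges at once"
total_bytes = sum(stop - start + 1 for start, stop in baskets)  # "too many bytes in a range"
print(n_ranges, len(range_header), total_bytes)

# When a request exceeds the server-side limits, XRootD answers with a 500 and
# uproot falls back to issuing the ranges one by one, which is much slower.
```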

Related uproot issue during these investigations: https://github.com/scikit-hep/uproot5/issues/881.

Impact on ServiceX

More details about the behavior of ServiceX from @masonproffitt:

The uproot backend does not set anything related to chunking; it just uses uproot's default settings. The problems differ between uproot4 (used in the current version of the ServiceX uproot transformer) and uproot5. In uproot4, the main problem is that uproot.lazy has an explicit iterator over branches, so the execution time scales linearly with both the number of branches accessed and the round-trip latency. In uproot5, this problem should disappear thanks to uproot.dask, but there the issue is that it hits these XRootD limits and falls back to individual requests (at least one per branch, possibly even one per basket).

For uproot5 we can set the step_size in the ServiceX transformer, but I don't think there is a consistent way to guarantee that we stay under these limits, because there are separate limits on (1) the number of byte ranges, (2) the total ASCII length of the Range field, and (3) the total number of actual bytes requested via Range. The problem is that there is no way to know the number and size of the baskets before the code executes. Handling this would require either going deep into uproot itself, or inspecting a lot of metadata at runtime and modifying the generated code in very non-trivial ways.
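As a rough illustration of the step_size knob (this is not the actual ServiceX transformer code): with uproot 5 the chunking of an uproot.dask read can be controlled via step_size, which indirectly affects how many byte ranges end up in a single request. The file URL, tree name, and branch names below are placeholders.

```python
import uproot

# Hedged sketch, not ServiceX code: tuning step_size on an uproot.dask read.
# Smaller steps mean fewer baskets per partition and therefore fewer ranges per
# request, but as noted above there is no value guaranteed to respect all limits.
url = "https://example.org/some_nanoaod_file.root"  # placeholder file

events = uproot.dask(
    {url: "Events"},
    filter_name=["Muon_pt", "Muon_eta"],  # restrict to the branches actually needed
    step_size="50 MB",                    # upper bound on data read per partition
)

muon_pt = events["Muon_pt"].compute()  # reads happen here, in per-partition requests
```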

Impact on coffea

It is currently unclear whether coffea would be affected differently when ingesting the input dataset directly. Are there any tricks that may matter here, @nsmith- @lgray? We are still using "old" coffea at the moment, though we are preparing to switch to coffea 2023.
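For reference, a minimal sketch of what "ingesting the input dataset directly" looks like with the current ("old", 0.7-style) coffea API; the file URL is a placeholder, and the point is only that this path also goes through uproot, so the same range-request behavior is in play.

```python
from coffea.nanoevents import NanoEventsFactory, NanoAODSchema

# Sketch of a direct coffea read of a NanoAOD file over http (placeholder URL).
# This also uses uproot underneath, so whether its chunking behaves differently
# from the ServiceX transformer is exactly the open question above.
events = NanoEventsFactory.from_root(
    "https://example.org/some_nanoaod_file.root",
    treepath="Events",
    schemaclass=NanoAODSchema,
).events()

print(events.Muon.pt)
```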

masonproffitt commented 1 year ago

I just tested this: the difference is simply the number of baskets in those files. The NanoAOD file has 251 baskets per branch, and the ntuple has 10. You therefore hit the XRootD limit of 1024 byte ranges per request very quickly for the NanoAOD file, but not for the ntuple.
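A quick way to check basket counts like these is uproot's TBranch.num_baskets, which reports how many baskets a branch is split into, i.e. how many byte ranges a full read of that branch needs. The URL and branch names below are placeholders.

```python
import uproot

# Count baskets per branch (placeholder URL and branch names).
tree = uproot.open("https://example.org/some_nanoaod_file.root")["Events"]
for name in ["Muon_pt", "Electron_pt"]:
    print(name, tree[name].num_baskets)
```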