ssl-hep / ServiceX

ServiceX - a data delivery service pilot for IRIS-HEP DOMA
BSD 3-Clause "New" or "Revised" License

Failed transformation due to XRootD error: [ERROR] Operation expired #670

Open kyungeonchoi opened 9 months ago

kyungeonchoi commented 9 months ago

Describe the bug

An intermittent XRootD error causes ServiceX transformations to fail. I suspect it happens for files that are not yet cached in XCache. The Rucio DID (mc21_13p6TeV.601190.PhPy8EG_AZNLO_Zmumu.deriv.DAOD_PHYSLITE.e8453_s3873_r13829_p5631, ATLAS only) contains about 1,000 files totaling about 2 TB. The files are distributed across three sites: PRAGUELCG2_DATADISK (651 files), NDGF-T1_DATADISK (402 files), and CA-SFU-T2_DATADISK (25 files).

I asked for 13 columns to be delivered using the Python transformer. 580 of 1,078 files failed on the first trial and 10 of 1,078 failed on the second trial. All 10 failed files are located at PRAGUELCG2_DATADISK. The failed file can be downloaded via xrdcp.
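Because the failures are intermittent and the file itself is reachable with xrdcp, one way to check a failed file by hand is to retry the copy a few times with an increasing delay. The following is only a minimal sketch, not how ServiceX handles retries: `retry_with_backoff` and `xrdcp_fetch` are hypothetical helper names, the attempt count and delays are arbitrary, and it assumes the `xrdcp` command is on the PATH with valid credentials.

```python
import subprocess
import time


def retry_with_backoff(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    The delay doubles after every failed attempt (base_delay, 2*base_delay, ...).
    Hypothetical helper for illustration; not part of ServiceX or XRootD.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # e.g. a failed xrdcp call
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


def xrdcp_fetch(url, dest):
    """Copy one file with the xrdcp CLI; raises CalledProcessError on failure.

    Assumes xrdcp is installed and the caller has valid grid credentials.
    """
    subprocess.run(["xrdcp", url, dest], check=True)
    return dest
```

A call such as `retry_with_backoff(lambda: xrdcp_fetch(url, "/tmp/test.root"), attempts=5, base_delay=10)` would then try the copy up to five times, where `url` is the root:// path of a failed file and the destination path is a placeholder.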

Screenshots

Error message:

Failed to transform input file root://xcache.af.uchicago.edu:1094//root://ftp1.ndgf.org:1094//atlas/disk/atlasdatadisk/rucio/mc21_13p6TeV/85/6d/DAOD_PHYSLITE.33080397._000040.pool.root.1: XRootD error: [ERROR] Operation expired
in file root://xcache.af.uchicago.edu:1094//root://ftp1.ndgf.org:1094//atlas/disk/atlasdatadisk/rucio/mc21_13p6TeV/85/6d/DAOD_PHYSLITE.33080397._000040.pool.root.1
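The path in the message above is an XCache URL with the origin URL embedded after the cache host, so the origin storage host can be read straight from it (here `ftp1.ndgf.org`). To see how failures are spread across origin hosts, the failed paths could be grouped as in the sketch below. The helper names `origin_host` and `count_by_origin` are hypothetical, and the code assumes the failed URLs have already been collected from the transformation logs.

```python
from collections import Counter
from urllib.parse import urlsplit


def origin_host(url):
    """Return the origin storage host of a (possibly XCache-prefixed) root:// URL."""
    # In an XCache-prefixed URL the origin URL is the last "root://" in the string.
    idx = url.rfind("root://")
    origin = url[idx:] if idx >= 0 else url
    return urlsplit(origin).hostname


def count_by_origin(urls):
    """Count failed files per origin host."""
    return Counter(origin_host(u) for u in urls)


failed = [
    "root://xcache.af.uchicago.edu:1094//root://ftp1.ndgf.org:1094//atlas/disk/"
    "atlasdatadisk/rucio/mc21_13p6TeV/85/6d/DAOD_PHYSLITE.33080397._000040.pool.root.1",
]
print(count_by_origin(failed))
```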

Files processed for 2 hours - 1st trial (top), 2nd trial (bottom)

[Images: xrootd_error_1st, xrootd_error_2nd]

ponyisi commented 1 month ago

This should be much improved in the uproot 5.3.9-based transformations, but we should stay aware of it.