iris-hep / idap-200gbps-atlas

benchmarking throughput with PHYSLITE

uproot/asyncio hide real error as a timeout #114

Open gordonwatts opened 3 months ago

gordonwatts commented 3 months ago

This error is reproducible and does not appear to be an actual timeout:

(venv) [bash][gwatts]:idap-200gbps-atlas > python servicex/servicex_materialize_branches.py -v --distributed-client scheduler --dask-scheduler 'tcp://dask-gwatts-2e1782e2-0.af-jupyter:8786' --dask-profile --dataset mc_1TB --query xaod_medium --num-files 0
0000.0468 - INFO - root - Using release 22.2.107 for type information.
0000.0818 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8424 - INFO - root - Running over 1 datasets, 1.222 TB and 136458000 events.
0000.8429 - INFO - root - Building ServiceX query
0000.8433 - INFO - root - Querying dataset mc20_13TeV:mc20_13TeV.364157.Sherpa_221_NNPDF30NNLO_Wmunu_MAXHTPTV0_70_CFilterBVeto.deriv.DAOD_PHYSLITE.e5340_s3681_r13145_p6026
0000.8433 - INFO - root - Running on the full dataset(s).
0000.8434 - INFO - root - Starting ServiceX query
0000.8565 - INFO - servicex.servicex_client - Returning code generators from cache
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?  
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?
0002.7905 - INFO - servicex.query - ServiceX Transform speed_test_mc20_13TeV:mc20_13TeV.364157.Sherpa_221_NNPD
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?            
Transform     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/?            
Download/URLs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1211/1211 03:54
0497.0524 - INFO - root - Event rate for ServiceX: 00:08:16 time, 275.00 kHz, Data rate: 19.70 Gbits/s
0497.0525 - INFO - root - Dataset speed_test_mc20_13TeV:mc20_13TeV.364157.Sherpa_221_NNPDF30NNLO_Wmunu_MAXHTPTV0_70_CFilterBVeto.deriv.DAOD_PHYSLITE.e5340_s3681_r has 1184 files
Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 234, in _cat_file
    async with session.get(self.encode_url(url), **kw) as r:
  File "/venv/lib/python3.9/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/venv/lib/python3.9/site-packages/aiohttp/client.py", line 608, in _request
    await resp.start(conn)
  File "/venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 991, in start
    self._continue = None
  File "/venv/lib/python3.9/site-packages/aiohttp/helpers.py", line 735, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 506, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 196, in main
    report, n_events = dask.compute(*calculate_n_events(dataset_files, steps_per_file))
  File "/venv/lib/python3.9/site-packages/dask/base.py", line 661, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1316, in __call__
    (result, counters), duration = with_duration(self._call_impl)(
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1154, in wrapper
    result = f(*args, **kwargs)
  File "/venv/lib/python3.9/site-packages/uproot/_dask.py", line 1268, in _call_impl
    ttree = uproot._util.regularize_object_path(
  File "/venv/lib/python3.9/site-packages/uproot/_util.py", line 964, in regularize_object_path
    file = ReadOnlyFile(
  File "/venv/lib/python3.9/site-packages/uproot/reading.py", line 573, in __init__
    self._begin_chunk = self._source.chunk(
  File "/venv/lib/python3.9/site-packages/uproot/source/fsspec.py", line 98, in chunk
    data = self._fs.cat_file(self._file_path, start=start, end=stop)
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 101, in sync
    raise FSTimeoutError from return_result
fsspec.exceptions.FSTimeoutError
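
In the traceback above, the `asyncio.TimeoutError` raised inside aiohttp is re-raised by fsspec's `sync` as `FSTimeoutError`, so only the outermost exception type is visible at the call site. A minimal sketch of how one might walk the exception chain to see what is underneath follows. It uses only the standard library; `WrappedTimeoutError`, `exception_chain`, `describe_chain`, and `failing_read` are hypothetical names introduced for illustration, and `WrappedTimeoutError` merely stands in for `fsspec.exceptions.FSTimeoutError`.

```python
import asyncio


class WrappedTimeoutError(Exception):
    """Hypothetical stand-in for fsspec.exceptions.FSTimeoutError."""


def exception_chain(exc):
    """Return exc followed by every exception linked via __cause__ or __context__.

    __context__ is followed even when it was suppressed with `from None`,
    because the goal here is to surface anything the outer error may hide.
    """
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        chain.append(exc)
        seen.add(id(exc))
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def describe_chain(exc):
    """Format each link of the chain as 'TypeName: message'."""
    return [f"{type(e).__qualname__}: {e}" for e in exception_chain(exc)]


def failing_read():
    """Mimic the shape of the traceback: an inner timeout re-raised as a wrapper."""
    try:
        raise asyncio.TimeoutError("inner timeout")
    except asyncio.TimeoutError as inner:
        raise WrappedTimeoutError("outer wrapper") from inner


try:
    failing_read()
except WrappedTimeoutError as err:
    chain = exception_chain(err)
    for line in describe_chain(err):
        print(line)
```

Under the same assumptions, the `dask.compute(...)` call in `servicex_materialize_branches.py` could be wrapped in a similar `try`/`except` that logs the result of `describe_chain`, which may help show whether anything other than a timeout sits at the bottom of the chain.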