scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Task call time tracking for successful tasks with `uproot.dask` #1211

Open alexander-held opened 1 month ago

alexander-held commented 1 month ago

The `allow_read_errors_with_report` option in `uproot.dask` is convenient not only for understanding issues but also for studying performance. While `duration` seems to always get filled, I see `call_time` set to `None` for successful tasks. Unless this would be a performance issue, I believe that always filling this value (or filling it optionally, controlled in some manner) would be useful. I would be particularly interested in using it to calculate instantaneous data rates, as the report opens up the possibility of calculating the data read by each task, alongside all the required time stamps.
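To illustrate the kind of calculation this would enable, here is a minimal sketch in plain Python. It does not call uproot; the record layout is an assumption made for illustration. `call_time` is taken to be the task start as a timestamp in seconds, `duration` the task length in seconds, and `bytes_read` a hypothetical per-task byte count. It also assumes each task reads at a uniform rate over its duration.

```python
# Hypothetical per-task records, NOT the actual uproot report layout.
# call_time: task start (seconds), duration: seconds, bytes_read: bytes (assumed field).
tasks = [
    {"call_time": 0.0, "duration": 2.0, "bytes_read": 200_000_000},
    {"call_time": 1.0, "duration": 4.0, "bytes_read": 100_000_000},
    # A successful task as currently reported: duration is known, call_time is None.
    {"call_time": None, "duration": 3.0, "bytes_read": 150_000_000},
]


def task_rate(task):
    """Average read rate of one task in bytes per second."""
    return task["bytes_read"] / task["duration"]


def instantaneous_rate(tasks, t):
    """Sum the rates of all tasks running at time t (uniform-rate assumption)."""
    total = 0.0
    for task in tasks:
        start = task["call_time"]
        if start is None:
            # Without a start time the task cannot be placed on the timeline.
            continue
        if start <= t < start + task["duration"]:
            total += task_rate(task)
    return total


print(instantaneous_rate(tasks, 1.5))  # two tasks overlap at t = 1.5
print(instantaneous_rate(tasks, 3.0))  # only the second task is still running
```

The third record shows the limitation described above: its average rate can still be computed from `duration`, but without `call_time` it cannot contribute to the rate at any given moment.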

alexander-held commented 1 month ago

I am also wondering whether the kwarg name might be confusing: the report can be useful even when read errors are not a concern, but `allow_read_errors_with_report` makes it sound focused on the "read error" part rather than the "report" part. It might not be the most obvious setting to find for people who simply want more information.