scikit-hep / hepconvert

BSD 3-Clause "New" or "Revised" License

feature request: Local caching of remote files #72

Open NJManganelli opened 4 months ago

NJManganelli commented 4 months ago

Network performance can be a nuisance when converting files. One option is a feature that temporarily caches the whole file in a TMPDIR/scratch space and then runs the conversion locally. The cached file could then be evicted naturally from the tmp space, instead of the user having to explicitly download it and delete it after conversion. A similar feature may be desirable for writing output as well, since transferring a complete file is more resilient than writing iteratively over a narrow network pipe.

An older ROOT-oriented example of doing this is the NanoAOD tools PostProcessor. fsspec etc. probably changes the scope/picture of doing this; in fact, it documents local file caching here: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally

A simple HTTP copy example: https://github.com/iris-hep/analysis-grand-challenge/blob/6458376a2d41b84e3290df4a03922d23d768f484/analyses/cms-open-data-ttbar/utils/file_input.py#L117
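
To make the request a bit more concrete, here is a minimal sketch of the input-side caching described above, using only the Python standard library. The name `locally_cached` and its `scratch_dir` parameter are hypothetical and not part of hepconvert. The sketch assumes the remote file can be opened by `urllib` (e.g. over http/https); anything like XRootD would need a different copy step or an fsspec-based approach.

```python
import contextlib
import shutil
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path


@contextlib.contextmanager
def locally_cached(url, scratch_dir=None):
    """Copy the whole file at `url` into a temporary directory and yield
    the local path. Hypothetical helper, not an existing hepconvert API.

    `scratch_dir=None` lets `tempfile` pick the default location, which
    honours the TMPDIR environment variable.
    """
    with tempfile.TemporaryDirectory(dir=scratch_dir) as tmp:
        # Reuse the remote file name so the extension (.root, ...) survives.
        name = Path(urllib.parse.urlparse(url).path).name or "cached_file"
        local_path = Path(tmp) / name
        # Stream the complete file to local disk before any conversion starts.
        with urllib.request.urlopen(url) as src, open(local_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # The temporary directory and the cached copy are removed on exit.
        yield local_path


# Usage sketch (the URL and the conversion call are placeholders):
#
# with locally_cached("https://example.org/data/events.root") as local_file:
#     convert(local_file, "events.parquet")
```

The context manager removes the cached copy as soon as the block exits, which is stricter than the "evicted naturally" behaviour suggested above; leaving the file in a scratch area for later eviction would mean dropping the `TemporaryDirectory` cleanup. The fsspec caching page linked above describes a comparable mechanism that may cover more protocols.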