Currently the high-level features of this repository focus on on-the-fly dataloading, which I think is absolutely necessary for the largest datasets (multi-terabyte sizes). In most cases, however, datasets will be much smaller, and it would be easier and faster to clone the entire dataset to local storage before running a pipeline.
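As a rough sketch of the idea, a helper like the one below could perform a one-time bulk copy of the dataset to local storage and return the local path, which the existing on-the-fly dataloader could then read from. The function name, directory layout, and use of `shutil.copytree` are all assumptions for illustration, not part of this repository's API:

```python
import shutil
from pathlib import Path


def ensure_local_copy(remote_dir: str, local_dir: str) -> Path:
    """Clone a whole dataset to local storage once, then reuse it.

    Hypothetical helper: if the local copy already exists, skip the
    transfer and just return the local path.
    """
    src, dst = Path(remote_dir), Path(local_dir)
    if not dst.exists():
        # One bulk copy up front instead of per-sample remote reads.
        shutil.copytree(src, dst)
    return dst
```

A pipeline could then do `data_root = ensure_local_copy("/mnt/remote/dataset", "/tmp/dataset")` at startup and point its dataloader at `data_root`; repeated runs on the same machine would pay the copy cost only once.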