epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0

`pipe storage cp` shall start data upload before traversing full source hierarchy #2574

Open sidoruka opened 2 years ago

sidoruka commented 2 years ago

Background: At the moment, the `pipe storage cp/mv` CLI command runs `--recursive` operations in (at least) two phases:

For huge filesystem hierarchies, the scan process may take hours and consume a lot of memory.

Approach: It would be great to make the scanning procedure more asynchronous, e.g.:
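
One way to overlap scanning and uploading is a producer/consumer layout with a bounded queue. The sketch below is only an illustration under that assumption and is not the actual `pipe` code: `walk_lazily`, `copy_streaming`, and the `upload_file` callback are hypothetical names, and a real implementation would also need to handle remote sources, retries, and progress reporting.

```python
import os
import queue
import threading
from typing import Callable, Iterator, List, Tuple

_DONE = object()  # sentinel telling a worker that the scan has finished


def walk_lazily(root: str) -> Iterator[str]:
    """Yield file paths one at a time instead of building the full listing."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def copy_streaming(
    root: str,
    upload_file: Callable[[str], None],  # hypothetical per-file upload callback
    workers: int = 4,
    max_pending: int = 1000,
) -> Tuple[List[str], List[str]]:
    """Upload files while the scan is still running.

    Returns (uploaded_paths, failed_paths).
    """
    # The bounded queue caps memory: the scanner blocks when uploads fall behind.
    pending: queue.Queue = queue.Queue(maxsize=max_pending)
    uploaded: List[str] = []
    failed: List[str] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            path = pending.get()
            if path is _DONE:
                return
            try:
                upload_file(path)
            except Exception:
                # Record the failure and keep draining the queue so the
                # scanner never blocks on a dead worker.
                with lock:
                    failed.append(path)
            else:
                with lock:
                    uploaded.append(path)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    # Producer: the first upload can start as soon as the first file is found.
    for path in walk_lazily(root):
        pending.put(path)

    # One sentinel per worker so every thread exits.
    for _ in threads:
        pending.put(_DONE)
    for t in threads:
        t.join()
    return uploaded, failed
```

With this layout, at most `max_pending` paths are held in memory at once, and the time to the first upload no longer depends on the size of the source hierarchy.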

sidoruka commented 2 years ago

Backport to release/0.16