Open rabernat opened 3 years ago
Thanks for bringing these up! The sync example using Hadoop DistCp looks particularly useful, although there are a lot of limiting factors:
As it stands, I think treeverse-distcp could be a useful tool for Pangeo Forge when writing recipes that copy or move data across cloud providers. In the long run, it would also be worth understanding how to apply something like this to GCP Dataflow or other cloud batch processing services.
These tools look really cool.