dotmesh-io / dotmesh

dotmesh (dm) is like git for your data volumes (databases, files etc) in Docker and Kubernetes
https://dotmesh.com
Apache License 2.0

[EXTRA] tar endpoint from a directory in a snapshot #639

Closed lukemarsden closed 5 years ago

lukemarsden commented 5 years ago

User requirement

As a dotmesh API consumer, I want to download a tarball of a directory in a snapshot of a datadot so that I can download many files in a single API call.

Key acceptance criteria

rusenask commented 5 years ago

how big can these files be?

lukemarsden commented 5 years ago

arbitrarily large

streaming the output of tar cf, run against a snapshot mount at a given subpath, back to the user would be one memory-efficient way to do it

rusenask commented 5 years ago

The same strategy can be used as we have for individual files, writing directly to the client connection. The archive itself should just be created in a temp dir rather than in memory

lukemarsden commented 5 years ago

why double the amount of disk space used and add cleanup complexity when we can just stream the output of tar cf back to the user?

rusenask commented 5 years ago

yeah, good point, tar writer can do that

alaric-dotmesh commented 5 years ago

https://godoc.org/archive/tar looks nice and easy to use, rather than shelling out to tar.