jjjake / internetarchive

A Python and Command-Line Interface to Archive.org
GNU Affero General Public License v3.0

Support streaming uploads #326

Closed JustAnotherArchivist closed 4 years ago

JustAnotherArchivist commented 4 years ago

It would be great if ia upload supported streaming uploads, e.g. reading from stdin or a FIFO, without using temporary files:

some-process-generating-lots-of-data | ia upload identifier -

See #77 and #173 for example use cases. A further idea is directly piping megawarc output to IA instead of writing it to disk and then re-reading it (multiple times), massively reducing disk I/O.

An implementation would, as far as I can tell, require server-side support on IA S3 for chunked transfer encoding and trailers. The former avoids having to know the size in advance; the latter would let --verify and --delete work by computing the MD5 hash during the upload and sending the Content-MD5 header as a trailer at the end.
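To illustrate the idea, here is a minimal sketch of what the client side could look like: it reads a stream in pieces, hashes it on the fly, and emits an HTTP/1.1 chunked body whose trailer section carries Content-MD5. The function name `chunked_body_with_md5_trailer` and the chunk size are hypothetical, and this is not how `ia upload` works today; it also assumes a server that accepts `Transfer-Encoding: chunked` and a `Trailer: Content-MD5` request header.

```python
import base64
import hashlib
import io


def chunked_body_with_md5_trailer(stream, chunk_size=1024 * 1024):
    """Yield an HTTP/1.1 chunked request body for `stream`.

    Hypothetical helper: the total size is never needed up front, and the
    MD5 is computed while the data passes through, then sent as a trailer.
    """
    md5 = hashlib.md5()
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        md5.update(data)
        # Each chunk is: hex length, CRLF, payload, CRLF.
        yield b"%x\r\n" % len(data) + data + b"\r\n"
    # Content-MD5 is the base64 encoding of the raw 16-byte digest.
    digest = base64.b64encode(md5.digest())
    # Zero-length chunk, then the trailer field, then the final empty line.
    yield b"0\r\n" + b"Content-MD5: " + digest + b"\r\n\r\n"


if __name__ == "__main__":
    # Demo with an in-memory stream; sys.stdin.buffer could be used instead.
    demo = io.BytesIO(b"hello world")
    for piece in chunked_body_with_md5_trailer(demo, chunk_size=5):
        print(piece)
```

Because the hash is only known after the last byte has been read, it cannot go in the regular request headers, which is why trailer support matters here.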

TheSneakySniper commented 4 years ago

Yeah, this would be a very useful feature.

jjjake commented 4 years ago

IA-S3 does not currently support streaming uploads, and as far as I know there aren't any plans to add it. I'm going to close the issue, but I'm happy to revisit it if server-side support ever comes. If you have more questions about server-side support, please email info@archive.org. Thanks!