snoyberg closed 6 years ago
I'd imagine we'll need to have an API which takes a Source which is run twice, or alternatively a Source together with the checksum and size.
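To make the second option concrete, here is a minimal sketch of copying an entry's payload while checking it against a size supplied up front. It assumes plain `IO` actions instead of a real conduit `Source`, and `copyWithDeclaredSize` and `fromChunks` are hypothetical names, not part of any existing library. Only one chunk is held at a time, so memory use does not grow with the entry size.

```haskell
import qualified Control.Exception as E
import Control.Monad (when)
import qualified Data.ByteString as B
import Data.IORef (atomicModifyIORef', newIORef)

-- | Hypothetical helper: forward chunks from a producer to a consumer,
-- failing if the total differs from the size declared in advance.
copyWithDeclaredSize
  :: Int                      -- ^ size promised up front (goes in the header)
  -> IO (Maybe B.ByteString)  -- ^ pull the next chunk; Nothing at end of input
  -> (B.ByteString -> IO ())  -- ^ push a chunk to the output
  -> IO ()
copyWithDeclaredSize declared pull push = go 0
  where
    go :: Int -> IO ()
    go seen = do
      mchunk <- pull
      case mchunk of
        -- End of input: the running total must match the declared size.
        Nothing ->
          when (seen /= declared) $
            E.throwIO (E.ErrorCall ("declared " ++ show declared
                                    ++ " bytes, saw " ++ show seen))
        Just chunk -> do
          let seen' = seen + B.length chunk
          -- Fail before writing anything past the declared size.
          when (seen' > declared) $
            E.throwIO (E.ErrorCall "payload exceeds declared size")
          push chunk
          go seen'

-- | Demo-only producer that hands out the given chunks one at a time.
fromChunks :: [B.ByteString] -> IO (IO (Maybe B.ByteString))
fromChunks chunks = do
  ref <- newIORef chunks
  pure $ atomicModifyIORef' ref $ \cs -> case cs of
    []        -> ([], Nothing)
    (c : cs') -> (cs', Just c)
```

The trade-off is that the caller has to know the size before streaming starts; a mismatch only shows up as an exception partway through or at the end of the entry.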
I checked the memory usage of tar for large files, and was surprised it never took more than 3MB. I looked at what it does, and it turns out that computing the checksum in constant space will not be a problem:
The chksum field is the ASCII representation of the octal value of the simple sum of all bytes in the header block.
(from the GNU tar documentation).
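As a rough illustration of why this is constant space: the sum covers only the 512-byte header block, never the file contents. The sketch below assumes the usual tar layout, where the chksum field is 8 bytes at offset 148 and is counted as ASCII spaces while summing; `headerChecksum` and `chksumField` are hypothetical names chosen for this example.

```haskell
import qualified Data.ByteString as B
import Numeric (showOct)

-- | Sum of all bytes in a header block, with the 8-byte chksum field
-- (assumed to sit at offset 148) counted as ASCII spaces (32).
headerChecksum :: B.ByteString -> Int
headerChecksum hdr = B.foldl' (\acc w -> acc + fromIntegral w) 0 blanked
  where
    (before, rest) = B.splitAt 148 hdr
    blanked = B.concat [before, B.replicate 8 32, B.drop 8 rest]

-- | The sum as zero-padded octal digits (six digits is the common width).
chksumField :: Int -> String
chksumField n = replicate (6 - length digits) '0' ++ digits
  where
    digits = showOct n ""
```

For example, in GHCi `chksumField (headerChecksum (B.replicate 512 0))` gives `"000400"`: an all-zero block contributes only the eight blanked bytes, 8 * 32 = 256, which is 400 in octal.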
@snoyberg We can already create tar files and handle the checksum properly, so if I understand this ticket correctly, we can go ahead and close it, right?
Good catch, thanks @lehins.
While I can see how to handle the checksum problem for consuming a tar file (throwing an exception on the last chunk of an entry), I don't see how to handle it in that case. The size field has a similar problem. It seems to me it will require unbounded amounts of memory!