Open c0xc opened 4 months ago
Why do you use `ZipInfo` instead of a string if you do not set its attributes?
> Why do you use `ZipInfo` instead of a string if you do not set its attributes?
That would cause the (streamed) file to be archived with a timestamp of 1980, which is the first bug. See my example: https://gist.github.com/c0xc/b54c005b296cdf6378ce65dd4aff3fe7#file-ztest-py-L12
`zip` uses the current timestamp when adding a file from streamed input, so I think it is reasonable to do the same in `zipfile`.
As for using the archive default compression by default, this is a more complex question.
> `zip` uses the current timestamp when adding a file from streamed input, so I think it is reasonable to do the same in `zipfile`.
Exactly, and that's what my suggested change does; see: https://github.com/python/cpython/pull/121405/files#diff-7629293618f2b3cf8ae7daf98526226fa12f047d71645a05497e6687aae10c76R413
> As for using the archive default compression by default, this is a more complex question.
Please note that the second bug I'm trying to fix is not merely about using a default compression (I don't want to change the default behavior). What happens is that the programmer explicitly specifies the compression (line 8), which is then ignored when adding files (lines 20 and 26).
> I believe that in almost all cases the current timestamp makes more sense; I rarely zip files from 1980.
Unfortunately, embedded timestamps are the most common problem for Reproducible Builds and changing this default might have unintended consequences.
Bug report
Bug description:
When streaming data into a zip file (opening an entry for writing by name), `zipfile` internally creates a `ZipInfo` object without explicitly setting `date_time`, so the entry gets a timestamp of 1980. I believe that in almost all cases the current timestamp makes more sense; I rarely zip files from 1980. To work around that issue, I create the `ZipInfo` object myself and pass it to the `open()` method, which exposes another bug: the compression specified for the archive is ignored and the file is simply stored.
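Both symptoms trace back to the defaults of a freshly constructed `ZipInfo`. The short sketch below only inspects those constructor defaults; it does not write an archive, and the entry name `streamed.txt` is just a placeholder.

```python
import zipfile

# A ZipInfo created from a name alone carries the constructor defaults.
info = zipfile.ZipInfo("streamed.txt")

# The default timestamp is the earliest date the ZIP format can store.
print(info.date_time)  # (1980, 1, 1, 0, 0, 0)

# The default compression method is "stored", i.e. no compression.
print(info.compress_type == zipfile.ZIP_STORED)  # True
```

Unless something overrides these two attributes before the entry is written, they are what ends up in the archive.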
After some minor investigation, I found issue #113971, which at least hinted at what I now use as a workaround but otherwise didn't fix this:
So you'd create a new ZipFile specifying a `compression`, but after `open()` you're just storing uncompressed.

In other words, I expect that:

a) when using open("file"...), it should use the current time as mtime by default

b) when creating a ZipInfo object to specify another date or another setting for the new file, it should inherit the previously configured compression

Both assumptions are wrong, which is what I'm suggesting to fix.
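Until the defaults change, the workaround described in this report can be sketched as a small helper that sets both attributes explicitly. The helper name `open_streamed_entry` and its `date_time` parameter are hypothetical and not part of `zipfile`. It only relies on the public `ZipFile.compression` attribute and on `ZipInfo.date_time` / `ZipInfo.compress_type`.

```python
import io
import time
import zipfile


def open_streamed_entry(zf, name, date_time=None):
    """Open a writable entry with an explicit timestamp and compression.

    Hypothetical helper: uses the current local time unless date_time is
    given, and copies the archive's configured compression method.
    """
    if date_time is None:
        date_time = time.localtime()[:6]
    zinfo = zipfile.ZipInfo(name, date_time=date_time)
    # Inherit the compression that was passed to the ZipFile constructor.
    zinfo.compress_type = zf.compression
    return zf.open(zinfo, "w")


# Usage: stream some bytes into an in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    with open_streamed_entry(zf, "data.txt") as dest:
        dest.write(b"a" * 1000)

with zipfile.ZipFile(buf) as zf:
    entry = zf.getinfo("data.txt")
```

Passing a fixed `date_time` instead of the current time keeps the output deterministic, which matters for the reproducible-builds concern raised in the comments.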
Here's a reproducer, and I'll add a PR as a suggested fix: https://gist.github.com/c0xc/b54c005b296cdf6378ce65dd4aff3fe7
CPython versions tested on:
3.11, CPython main branch
Operating systems tested on:
Linux