thejoshwolfe / yauzl

yet another unzip library for node
MIT License
736 stars 80 forks source link

encrypted zip files should not have undefined behavior #11

Closed andrewrk closed 7 years ago

andrewrk commented 10 years ago

From the docs:

Currently, the presence of encryption is not even checked, and encrypted zip files will cause undefined behavior.

encrypted zip files should, at the very least, have well-defined behavior. for example, returning an error and not crashing.

Same with ZIP64 files. The docs say the behavior is undefined, but the behavior should be defined to give an error and not crash.

thejoshwolfe commented 8 years ago

Proposal: detect when an entry is encrypted, and require the option {decrypt: false} be passed to openReadStream() for that entry, otherwise some kind of error is raised. The decrypt option would be illegal for non-encrypted entries. {decrypt: false} means that the bytes you get in the read stream will be copied directly out of the zipfile without any processing. This fits nicely with the proposals in #38, and in fact an entry that is both encrypted and compressed would additionally require {decompress: false} be in the options to openReadStream().

This way, we have well defined behavior (the original goal of this issue), and we force clients to be aware of the encrypted state of their entries, and also be aware that yauzl is not going to help them decrypt anything. If someday yauzl did support decrypting entries, the {decrypt: false} syntax could still be the way that you ask for the encrypted file as-is without doing any decryption. Until then, clients are free to decide whether they want to reject encrypted entries or decrypt them using their own code.

I believe the number of bytes to assert in a compressed and encrypted file is neither the uncompressedSize nor the compressedSize, but rather compressedSize + 12. That might be wrong, but I think it's going to be something like that. We may want to provide a function like zipfile.getRawBytesLength(entry, {decompress:false, decrypt:false}) that does this math for you. We'll see.

thejoshwolfe commented 7 years ago

Just read the spec regarding the "Strong Encryption" methods introduced in version 6.2. Encrypting the central directory will (probably) cause an invalid signature error from yauzl when trying to read the first central directory record. Seems like good behavior.

Regarding per-file encryption, there are three modes:

Those are the values you get from the General Purpose Bit Field when masking against 0x41. The plan is to support ignoring "tranditional" encryption via {decrypt: false}, and emit an error for "strong" encryption (which will terminate the processing of the zipfile).

I'm mostly against caring about the "strong" encryption in zip files, because i've never heard of anyone actually using it, it seems like an unnecessary feature by today's standards, and the descriptions in the zipfile are always accompanied by ads to incorporate PKWARE's proprietary, patented bullshit. I just want to know that when it does happen, yauzl doesn't have some obvious bug or produce bogus data without raising alarms for an easily detectable situation. From what I've gathered in my research, yauzl needs to be careful about general purpose bits 0 and 6, and from there the existing signature checks are already sufficient.

thejoshwolfe commented 7 years ago

For reference, the previous behavior for encrypted files was:

As expected, this isn't really "undefined behavior" in the C sense where it's a security vulnerability. Really the only problem would be confusing error messages and possibly some misleading data.