kimci86 / bkcrack

Crack legacy zip encryption with Biham and Kocher's known plaintext attack.
zlib License

Unable to find encryption keys with complete plaintext file and ZipCrypto Deflate #106

Closed navy3dfx closed 1 year ago

navy3dfx commented 1 year ago

First, I would like to say that the efficiency of bkcrack is fascinating. I have been having fun with it over the last couple of months and have always had perfect results. A great piece of software!

However, I now have a deflated ZipCrypto archive for which I have a complete plaintext file. Unfortunately, the file is only 158 bytes. I am able to compress it with many different tools and reach the expected compressed size of 106 bytes (118 bytes in the encrypted archive, which includes the 12-byte encryption header), but I have never managed to obtain the keys.

As the exact attack algorithm is too complicated for me to understand, I would like to ask whether it might be theoretically impossible to find the encryption keys because of some limitation, or whether I just need to keep trying different compression options. Is such "brute-forcing" of compression methods the only way to go, or is there some clever, more efficient way to tell that I have found the right compression method without running the attack on every try? Or do you have other recommendations for me? Thank you very much!

encrypted_config.network.dhcp.server.001.zip plaintext_config.network.dhcp.server.001.zip

Edit: I also tried the deflate Python script to get a compressed plaintext, but again without success. I can't be sure that I'm using it correctly, though, since I wasn't able to find any documentation or examples.

kimci86 commented 1 year ago

> Is such "brute-forcing" of compression methods the only way to go, or is there some clever, more efficient way to tell that I have found the right compression method without running the attack on every try?

Besides checking that the compressed size matches the expected one before trying, you could compare the output of different tools and configurations and run the attack only once for each group that produces identical output. Beyond that, I don't have a better answer.
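
The size check and de-duplication described above can be sketched in Python. This is only an illustration using the standard `zlib` module; the function names `raw_deflate` and `distinct_candidates` are made up for this example, and the parameter grid (levels, memory levels, three strategies) is an assumption about what is worth trying, not an exhaustive list of what other tools can produce.

```python
import hashlib
import zlib


def raw_deflate(data: bytes, level: int, mem_level: int = 8,
                strategy: int = zlib.Z_DEFAULT_STRATEGY) -> bytes:
    """Compress to a raw deflate stream (no zlib or gzip wrapper), as stored in ZIP."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS,
                                  mem_level, strategy)
    return compressor.compress(data) + compressor.flush()


def distinct_candidates(data: bytes, expected_size: int) -> dict:
    """Map SHA-256 digest -> (compressed bytes, list of settings producing them).

    Only outputs whose length equals expected_size are kept, and settings
    that yield byte-identical output are grouped under one digest.
    """
    candidates = {}
    strategies = (zlib.Z_DEFAULT_STRATEGY, zlib.Z_FILTERED, zlib.Z_HUFFMAN_ONLY)
    for level in range(1, 10):
        for mem_level in range(1, 10):
            for strategy in strategies:
                out = raw_deflate(data, level, mem_level, strategy)
                if len(out) != expected_size:
                    continue  # cannot be the right stream, skip it
                digest = hashlib.sha256(out).hexdigest()
                entry = candidates.setdefault(digest, (out, []))
                entry[1].append((level, mem_level, strategy))
    return candidates
```

Each distinct output could then be written to its own file and tried once as compressed plaintext (for example with bkcrack's `-p` option), instead of running the attack once per setting.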

> Or do you have other recommendations for me?

I had a look at your files and I confirm I did not (at first) find a solution using zlib compression or 7-zip compression with various compression levels.

However, I noticed the ZIP file looks a little strange. It includes comments (which is valid in the ZIP format but still not so common to see), and the metadata for the last modification date and time is missing (the bytes for this metadata are set to zero). Maybe this ZIP file was made by a custom tool or modified for a challenge? Maybe this ZIP file does not completely conform to the ZIP format specification?

With this in mind, I questioned an assumption that is true for usual ZIP files but might not be true here. This assumption is that the check byte, that is to say the last byte of the encryption header (12 bytes prepended to compressed data before encryption), can be derived from entry metadata. This assumption is used by bkcrack to derive an additional byte of known plaintext. To prevent this behavior, there is an option --ignore-check-byte.

Conclusion: it turns out this assumption indeed is not true for this ZIP file and your compressed plaintext was correct. We can get the solution with this command:

bkcrack -C encrypted_config.network.dhcp.server.001.zip --cipher-index 0 -P plaintext_config.network.dhcp.server.001.zip --plain-index 0 --ignore-check-byte

navy3dfx commented 1 year ago

Thank you very much for the detailed response! If I understood correctly - according to the ZIP specification, the last header byte is a checksum and bkcrack calculates it in order to gain additional plaintext and reduce calculation time. In fact, using the --ignore-check-byte option results in successful calculation of keys. I'm not, however, able to decrypt the archive with those keys as it seems that they are not correct. I tried the attack against a couple of compressed plaintext archives and the results are consistent. The encrypted archive is a backup file made by the web interface of a gateway, running Linux. Since all files inside have the same attributes and comments, I guess that it's some kind of a custom script using a readily available library - zlib, libzip or something...

kimci86 commented 1 year ago

> If I understood correctly - according to the ZIP specification, the last header byte is a checksum and bkcrack calculates it in order to gain additional plaintext and reduce calculation time.

Yes. The check byte is either the most significant byte of the entry's CRC-32 or the most significant byte of its last modification time, depending on bit 3 (the data descriptor flag) of the entry's general purpose bit flag.
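
As a rough illustration of that rule, the following Python sketch derives the expected check byte from a 30-byte ZIP local file header. It assumes the common convention that bit 3 of the general purpose bit flag selects the high byte of the modification time and that the high byte of the CRC-32 is used otherwise. The helper names are invented for this example and are not part of bkcrack.

```python
import struct

# Fixed-size part of a ZIP local file header (30 bytes, little-endian):
# signature, version needed, flags, method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_HEADER_SIGNATURE = 0x04034B50


def expected_check_byte(flags: int, crc32: int, last_mod_time: int) -> int:
    """Return the check byte a conforming writer is assumed to have used."""
    if flags & 0x0008:  # bit 3 set: CRC not known up front, time is used
        return (last_mod_time >> 8) & 0xFF
    return (crc32 >> 24) & 0xFF


def check_byte_from_local_header(header: bytes) -> int:
    """Parse the fixed part of a local file header and derive the check byte."""
    (signature, _version, flags, _method, mod_time, _mod_date,
     crc32, _csize, _usize, _name_len, _extra_len) = LOCAL_HEADER.unpack_from(header)
    if signature != LOCAL_HEADER_SIGNATURE:
        raise ValueError("not a ZIP local file header")
    return expected_check_byte(flags, crc32, mod_time)
```

If a writer does not follow this convention, as appears to be the case for the archive in this issue, the derived byte is simply wrong, which is why ignoring it can be necessary.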

> I'm not, however, able to decrypt the archive with those keys as it seems that they are not correct.

Do you mean you tried to set a different password with option -U and then extract with an archive manager? Surely that would fail because an archive manager would also expect to find the check byte. It would consider the provided password to be wrong if the last deciphered encryption header byte does not match the expected value derived from metadata.
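
To make the archive manager's check concrete, here is a minimal Python sketch of the traditional ZipCrypto keystream and the header test, written from the publicly documented algorithm. The class `ZipCryptoKeys` and the function `header_check_passes` are hypothetical names for this illustration; real archive managers and bkcrack implement this differently and much faster.

```python
def _make_crc_table():
    """Standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


class ZipCryptoKeys:
    """Internal state of the traditional PKWARE cipher (three 32-bit keys)."""

    def __init__(self, k0=0x12345678, k1=0x23456789, k2=0x34567890):
        self.k0, self.k1, self.k2 = k0, k1, k2

    @classmethod
    def from_password(cls, password: bytes) -> "ZipCryptoKeys":
        keys = cls()
        for byte in password:
            keys.update(byte)
        return keys

    def update(self, plain_byte: int) -> None:
        self.k0 = (self.k0 >> 8) ^ CRC_TABLE[(self.k0 ^ plain_byte) & 0xFF]
        self.k1 = ((self.k1 + (self.k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = (self.k2 >> 8) ^ CRC_TABLE[(self.k2 ^ (self.k1 >> 24)) & 0xFF]

    def stream_byte(self) -> int:
        temp = (self.k2 | 2) & 0xFFFF
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self.stream_byte())
            self.update(p)  # the state is always updated with the plaintext byte
        return bytes(out)

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self.stream_byte()
            self.update(p)
            out.append(p)
        return bytes(out)


def header_check_passes(keys: ZipCryptoKeys, encrypted_header: bytes,
                        expected_check_byte: int) -> bool:
    """Decrypt the 12-byte encryption header and compare its last byte."""
    header = keys.decrypt(encrypted_header[:12])
    return header[11] == expected_check_byte
```

With an archive whose writer put something else in the last header byte, `header_check_passes` would return `False` even for the correct keys, which matches the failure described above.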

Instead, you can decipher entries one by one using bkcrack and decompress them using the inflate script from bkcrack's tools folder like this:

bkcrack -k cf7ee7cc 39947cd4 fa867eda -C encrypted_config.network.dhcp.server.001.zip --cipher-index 0 -d deciphered_data
tools/inflate.py < deciphered_data > decompressed.txt
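
For readers without the tools folder at hand, the decompression step can be approximated with Python's standard `zlib` module. This is a sketch of what such a script might do, not the actual contents of tools/inflate.py, and it assumes the deciphered data is a raw deflate stream with the 12-byte encryption header already removed. The function names are illustrative.

```python
import zlib


def inflate_raw(data: bytes) -> bytes:
    """Decompress a raw deflate stream (no zlib or gzip header), as stored in ZIP."""
    decompressor = zlib.decompressobj(-zlib.MAX_WBITS)
    return decompressor.decompress(data) + decompressor.flush()


def inflate_file(source_path: str, destination_path: str) -> int:
    """Inflate source_path into destination_path and return the output size."""
    with open(source_path, "rb") as source:
        inflated = inflate_raw(source.read())
    with open(destination_path, "wb") as destination:
        destination.write(inflated)
    return len(inflated)
```

Calling `inflate_file("deciphered_data", "decompressed.txt")` would then play the same role as the second command above.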

navy3dfx commented 1 year ago

Thank you very much, I confirm that manual extraction works! It would be great if you add a contribution possibility in the readme :)