Open magnumripper opened 3 years ago
I see now my initial attempt does work, provided the encrypted zip is stored, not deflated:
$ echo "Test data alpha bravo charlie echo delta fox golf hotel" > test.txt
$ rm -f test.zip && zip -0 -e test.zip test.txt
Enter password:
Verify password:
adding: test.txt (deflated 2%)
$ ./bkcrack -C test.zip -c test.txt -p test.txt
bkcrack 1.3.0 - 2021-08-16
[20:26:39] Z reduction using 61 bytes of known plaintext
100.0 % (61 / 61)
[20:26:39] Attack on 134339 Z values at index 7
Keys: a5025690 1257b418 cee8bad2
30.3 % (40665 / 134339)
[20:27:19] Keys
a5025690 1257b418 cee8bad2
So I guess there's no bug, just a confused user. Maybe some clarifications in the documentation: Apparently the "plaintext" must be as-is in the attacked archive, so we have to match deflated-or-not and so on.
This however makes me wonder when the -x and -o options are usable at all... they're not of much use unless the attacked file is stored, right?
Hello, thank you for reporting this in such detail.
You understood correctly, plaintext must be (a part of) the encrypted data just before encryption, which means it might have to be compressed. It often confuses people. I need to document this better and have more explicit error messages.
Also, you are right: the -x and -o options are probably never useful when compression is used, because a large chunk of uncompressed data is required to get the right compressed data. There could be a warning message when they are used on compressed data.
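To make the "layer" point concrete, here is a minimal sketch using only Python's standard zipfile and zlib modules on unencrypted archives. It is not something bkcrack does; the helpers make_zip and raw_entry_data are hypothetical names made up for this illustration. It shows that the bytes of a stored entry are the file content itself, while the bytes of a deflated entry are a raw deflate stream that has to be inflated to get the content back.

```python
import io
import struct
import zipfile
import zlib


def make_zip(data: bytes, method: int) -> bytes:
    """Build an in-memory, unencrypted zip holding a single entry test.txt."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("test.txt", data)
    return buf.getvalue()


def raw_entry_data(archive: bytes, name: str) -> bytes:
    """Return the bytes of one entry exactly as they sit in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    # Local file header: 30 fixed bytes, then the file name and the extra field.
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.compress_size]


# Repeated text, so that deflate clearly has something to compress.
text = b"Test data alpha bravo charlie echo delta fox golf hotel\n" * 20

stored = raw_entry_data(make_zip(text, zipfile.ZIP_STORED), "test.txt")
deflated = raw_entry_data(make_zip(text, zipfile.ZIP_DEFLATED), "test.txt")

print(stored == text)                          # True: stored data is the file itself
print(deflated == text)                        # False: this is a deflate stream
print(zlib.decompress(deflated, -15) == text)  # True: raw deflate, no zlib header
```

In the stored case, a fragment of the original file found at some offset is also a fragment of the archive data at that offset. In the deflated case that correspondence is gone, which is why an offset into the uncompressed file does not help.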
So I assume (only now) this also applies to the -t size option: if used with compressed data, we're talking compressed size, right?
I think I understand it all now but there's some documentation needed for making it clear to a newbie.
Yes, when compression is used, -t refers to compressed data. This can be useful for compressed data because, depending on the compression settings, compressed data can start the same but diverge at some point. I should document this too.
> -t refers to compressed data. This can be useful for compressed data because, depending on the compression settings, compressed data can start the same but diverge at some point.
Oh, right, that's a good point.
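One way to get a feel for where two compressed streams diverge is to deflate the same data with different settings and count the leading bytes they share. The following is only a sketch with Python's zlib; the sample data and the helper names raw_deflate and common_prefix_len are made up for illustration, and the numbers it prints depend on the zlib build and on the data, so no particular result is implied.

```python
import zlib


def raw_deflate(data: bytes, level: int) -> bytes:
    """Raw deflate stream (no zlib header), the form used inside zip entries."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Made-up sample data; replace it with the file you believe is in the archive.
sample = b"".join(b"record %d: alpha bravo charlie delta\n" % i for i in range(200))

reference = raw_deflate(sample, 6)
for level in (1, 9):
    other = raw_deflate(sample, level)
    print("level", level, "shares", common_prefix_len(reference, other), "leading bytes")
```

If the shared prefix is long enough, truncating the compressed plaintext to that length with -t keeps only the part that is plausibly identical in the attacked archive.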
This all is obviously down to seeing things in their right "layers" just like with networking: bkcrack only attacks the archive data so if the attacked file in the attacked archive is deflated with parameters so and so, everything is in terms of deflated data with such parameters. Not sure how to put it well in a usage blob 😵
That being said, I guess theoretically there could be code added for user saying "plaintext is literally alpha bravo charlie delta" (or my original try of -p plainfile) even though the attacked archive is deflated - at least as long as there is no offset involved. We'd just have to automagically deflate the given plaintext (using settings seen in the -C encrypted.zip -c cipher) and then use that.
That would probably add even more confusion for a newbie though 😆 🤣
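For what it's worth, the "automagic" step could be approximated outside bkcrack. The sketch below is hypothetical and not an existing bkcrack feature: it deflates a given plaintext at every zlib level and groups identical outputs, giving a set of candidate compressed plaintexts to try. It assumes the attacked archive was produced by a zlib-based tool, which may not be true. As far as I know the zip header only records a coarse hint about the deflate option (general purpose flag bits 1 and 2), not the exact settings.

```python
import zlib


def deflate_candidates(plain: bytes) -> dict:
    """Map each distinct raw deflate stream of `plain` to the zlib levels producing it."""
    candidates = {}
    for level in range(1, 10):
        comp = zlib.compressobj(level, zlib.DEFLATED, -15)
        stream = comp.compress(plain) + comp.flush()
        candidates.setdefault(stream, []).append(level)
    return candidates


# Made-up plaintext, long and repetitive enough for deflate to have some effect.
plain = b"alpha bravo charlie delta\n" * 40

for stream, levels in deflate_candidates(plain).items():
    print("levels", levels, "->", len(stream), "compressed bytes")
```

Each distinct stream could then be written to a file and passed as the plaintext, most likely truncated to its first bytes since the tail is the part most likely to differ.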
This discussion is super informative, but I'm still confused about how this is supposed to work if the target file is deflated and the known plaintext is too short to compress.
I have an encrypted ZIP that contains deflated PDFs. It's likely that each PDF (when expanded) starts with the following 15 bytes of plaintext, which I've extracted into a file 15 bytes long:
$ xxd pdf-head.dat
00000000: 2550 4446 2d31 2e37 0d0a 25b5 b5b5 b5 %PDF-1.7..%....
$ ls -l pdf-head.dat
-rw-r--r-- 1 jpatokal staff 15 20 Aug 21:31 pdf-head.dat
Per the discussion above, I need to deflate this to get it to match the bytes in the target, but a string of 15 chars is too short to allow deflating:
$ zip plain.zip pdf-head.dat
adding: pdf-head.dat (stored 0%)
Is there a way around this? For example, passing in a longer deflated ZIP of a similar PDF and guesstimating how many bytes would be the same?
When deflate can actually compress the data, it usually combines LZ77 matching with Huffman coding, using a Huffman tree built from a large block of data. A compressed block starts with a compressed representation of the Huffman tree, followed by the compressed data itself. So it is hard to get correct plaintext for compressed data when so few bytes are known.
You can try to compress similar PDF files and hope the Huffman tree and the first few compressed bytes will be the same. As PDF files already contain compressed data, the entropy is high and maybe the Huffman tree will be the same. I do not know how likely it is to work; probably not very.
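Both points can be checked with a small sketch in Python's zlib. It is an illustration only: the two tails below are made-up stand-ins for real PDF bodies, and the helper names are hypothetical. The first part deflates the 15-byte header on its own and prints the resulting size; since the result is no smaller than the input, a zip tool has no reason to deflate it. The second part deflates two inputs that share the header and counts how many leading compressed bytes agree, without implying any particular outcome.

```python
import zlib


def raw_deflate(data: bytes, level: int = 6) -> bytes:
    """Raw deflate stream (no zlib header), the form used inside zip entries."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# The 15 bytes from pdf-head.dat above.
head = bytes.fromhex("255044462d312e370d0a25b5b5b5b5")
print(len(head), "bytes in,", len(raw_deflate(head)), "bytes out")

# Made-up stand-ins for the rest of two different PDF files.
tail_a = b"\n1 0 obj\n<< /Type /Catalog >>\nendobj\n" * 50
tail_b = b"\n1 0 obj\n<< /Type /Pages /Count 3 >>\nendobj\n" * 50

shared = common_prefix_len(raw_deflate(head + tail_a), raw_deflate(head + tail_b))
print("compressed streams agree on the first", shared, "bytes")
```

Running the same comparison on a few real PDFs similar to the ones in the archive would give a rough idea of whether enough leading compressed bytes are stable to be worth trying.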
What is expected from file when using -p file without -P plain.zip?
The examples mention either -C encrypted.zip -c cipher -P plain.zip -p plain or -c cipherfile -p plainfile. I tried using a mix of them, as in -C encrypted.zip -c cipher -p plainfile - where (in my mind but perhaps not in bkcrack's) cipher was a file within encrypted.zip while plainfile was a plain file in my pwd simply containing the plaintext as-is - and that was accepted but it didn't work at all.
If that is supposed to work, it doesn't seem to; it might be a bug.
If it's not supposed to work at all without -P, it should bail with some informative error. I got confusing errors such as "Data error: plaintext offset is too large." or "Data error: ciphertext is smaller than plaintext.". With some combination of options (I think adding -t to the mix) I got it to run but it could not find the keys (these were all test runs with staged data - it should have).
Perhaps it is supposed to work, but only if plainfile is extracted (e.g. with dd) from an unencrypted archive? I did try to add my plainfile to a dummy, unencrypted zip file and then use -P dummy.zip -p plainfile and that did work just fine. If this is it, maybe just document it better.
Example of what did not work:
Here's what worked fine:
I could use those keys to crack the actual password, e.g. with hashcat.