Closed haj-hossein closed 3 years ago
Hello,
From the ZIP format specification (see PKWARE's APPNOTE.TXT, paragraph 6.1.6), we know that the most significant byte of a file's CRC is at the end of the encryption header.
In your example, the file 1.txt has a CRC value of 56C6A424.
So we know that the value 56 (in hexadecimal) is at offset -1 (just before actual encrypted data).
The value 56 (in hexadecimal) comes from your file metadata and the offset -1 comes from the ZIP format specification (and also how bkcrack is implemented).
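To make the byte extraction concrete, here is a minimal Python sketch that takes the most significant byte of a CRC, and optionally reads the CRC from the archive metadata with the standard zipfile module. The function names crc_check_byte and check_byte_for_entry are made up for this illustration; 1.zip and 1.txt are the names used in this thread.

```python
import zipfile


def crc_check_byte(crc: int) -> int:
    """Return the most significant byte of a 32-bit CRC."""
    return (crc >> 24) & 0xFF


def check_byte_for_entry(archive, entry_name: str) -> int:
    """Read an entry's CRC from the archive metadata and return its top byte.

    Only the metadata is read, so no password should be needed here.
    `archive` can be a path or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return crc_check_byte(zf.getinfo(entry_name).CRC)


if __name__ == "__main__":
    # CRC values quoted in this thread.
    print(f"{crc_check_byte(0x56C6A424):02X}")  # 56
    print(f"{crc_check_byte(0xA99F1D0D):02X}")  # A9
    # With the real archive at hand (hypothetical local path):
    # print(f"{check_byte_for_entry('1.zip', '1.txt'):02X}")
```

The printed value is what goes after -x -1 on the bkcrack command line.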
In the tutorial, the CRC value for the example svg file is a99f1d0d, so it gives the known byte a9. The command echo -n -e '\xa9<?xml version="1.0" ' > plain.txt is used to create a file starting with the byte a9 (in hexadecimal) followed by the typical beginning of an svg file. The flag -e tells echo to interpret the sequence \xa9 as a byte. The flag -n is used to prevent echo from adding a newline at the end.
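Since echo -e is not available in the Windows command prompt, a rough Python equivalent of that command may help. This is only a sketch: build_plain is a hypothetical helper name, and the known prefix has to be given as the exact bytes stored in the file (same encoding as the original).

```python
from pathlib import Path


def build_plain(crc_byte: int, known_prefix: bytes) -> bytes:
    """Return the CRC byte followed by the known plaintext bytes.

    Nothing is appended, which mirrors the -n flag of echo.
    """
    return bytes([crc_byte]) + known_prefix


if __name__ == "__main__":
    payload = build_plain(0xA9, b'<?xml version="1.0" ')
    Path("plain.txt").write_bytes(payload)  # binary write, no newline translation
    print(len(payload), payload.hex())
```

Writing in binary mode matters: a text-mode write could re-encode the byte or change line endings.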
In your case, I suggest you pass the information about the byte 56 at offset -1 with the following arguments: -x -1 56.
For example: bkcrack -C 1.zip -c 1.txt -p plaintext.txt -x -1 56
Note that using the known byte from the CRC is not mandatory as you already know 13 bytes (the minimum for now is 12), but it can make the attack faster.
The attack is faster when more contiguous plaintext is known, so if the known 13 bytes start at offset 0, the additional byte at offset -1 will extend the contiguous known plaintext to 14 bytes.
Did I answer your questions? Let me know if you need more information.
Thanks for your answer.
For example, my file is in Arabic and I know the first 31 bytes of 1.txt.
My known plaintext is السلام عليكم ورحمة الله وبركاته, which is 31 bytes.
I use the Windows version of bkcrack. The CRC32 is C948EE28, so I use C9.
With a plaintext file made on Windows, I ran this command: bkcrack.exe -C 1.zip -c 1.txt -p plain.txt -x -1 C9
My result is:
I also used this command on Linux: echo -n -e '\xC9السلام عليكم ورحمة الله وبركاته' > plain.txt and I got: ةالسلام عليكم ورحمة الله وبركاته
I changed the plaintext to this and checked again:
I have the plaintext but can't get the key of the file...
Please tell me why. I need help with this.
Thanks for your help.
I can see your file is compressed. To run a plaintext attack, the plaintext must also be compressed with the same algorithm. It will not be possible to get correct compressed plaintext because compression depends on the entire file (big blocks, actually) but you only know a small prefix. Sorry, but I think that a plaintext attack is not possible in this case. You can try brute force or dictionary attacks with other tools like John the Ripper or hashcat.
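One way to check whether an entry is compressed, sketched with Python's standard zipfile module. The names METHOD_NAMES, describe_entry and list_entries are invented for this example; only the archive metadata is read.

```python
import zipfile

# Human-readable names for the compression methods zipfile knows about.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored (no compression)",
    zipfile.ZIP_DEFLATED: "deflate",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def describe_entry(info: zipfile.ZipInfo) -> str:
    """Summarize the compression method, encryption flag and CRC of one entry."""
    method = METHOD_NAMES.get(info.compress_type, f"method {info.compress_type}")
    encrypted = bool(info.flag_bits & 0x1)  # bit 0 of the general purpose flags
    return f"{info.filename}: {method}, encrypted={encrypted}, CRC={info.CRC:08X}"


def list_entries(archive):
    """Return one summary line per entry; `archive` is a path or file-like object."""
    with zipfile.ZipFile(archive) as zf:
        return [describe_entry(info) for info in zf.infolist()]


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print("\n".join(list_entries(path)))
```

If the entry is reported as deflate, the plaintext given to the attack has to be the deflated bytes, not the raw text.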
Thank you for your answer. I made this file myself just for testing with Arabic text. Although I know the contents of the file, I can't get any result with Arabic text, but with an English test I do get a result. The file uses the legacy ZIP format and I know the entire content of the file (the plaintext is the first line of the file).
To get correct plaintext, you should create a zip file without password containing the entire file 1.txt (not just the first line). This way, the created zip file will contain compressed bytes corresponding to those in the encrypted archive just before encryption. Using only the first line as plaintext is not enough because it has to be compressed exactly like the encrypted file.
The files are here: 1.zip plain.txt
I tried with compressed plaintext:
bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt -x -1 8F
With Arabic or other languages (not English) I can't get the key.
Could you try again by compressing the entire file 1.txt (not just the first line) into plain.zip? Compressing only the first line will not generate the same compressed bytes.
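As a sketch of that step, the entire file can be compressed into an unencrypted archive with Python's zipfile module. make_plain_zip is a hypothetical helper. Note the assumption: different zip tools and compression levels can produce different deflate bytes, so it is safer to build plain.zip with the same tool and settings that produced the encrypted archive.

```python
import zipfile


def make_plain_zip(source_path: str, zip_path: str, entry_name: str) -> None:
    """Compress the whole source file with deflate into an unencrypted archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source_path, arcname=entry_name)


if __name__ == "__main__":
    import os
    # Hypothetical local file: the full, unmodified copy of 1.txt.
    if os.path.exists("1.txt"):
        make_plain_zip("1.txt", "plain.zip", "1.txt")
```

The resulting archive can then be passed with -P plain.zip and -p set to the entry name, as in the commands in this thread.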
OK, I did the whole test step by step.
This is my original file, to which I set a password using the legacy ZIP format:
This is plain.txt:
This is the zip of the plaintext:
Now I have these files (1.zip, plain.txt, plain.zip). The plaintext is 31 bytes.
Values in plain.zip:
The CRC32 is C948EE28, so I use C9.
Starting the attack with the additional byte from the CRC:
bkcrack-1.0.0-win64>bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt -x -1 C9
result :
Starting the attack without the additional byte:
bkcrack-1.0.0-win64>bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt
result :
I tried with more bytes of plaintext (plain2.txt, plain2.zip):
result :
I tried the attack with the entire content of the original file as plain3.txt, plain3.zip (all the text of 1.txt in 1.zip).
I also tried with your example secrets.zip ----> spiral.svg with this command:
The plaintext is spiral.svg with 12 bytes > <?xml version="1.0" ?> and the plaintext is compressed to zip -> plain.zip
bkcrack.exe -C secrets.zip -c spiral.svg -P spiral.zip -p spiral.svg
bkcrack 1.0.0 - 2020-11-11
Generated 4194304 Z values.
[19:01:16] Z reduction using 16 bytes of known plaintext
100.0 % (16 / 16)
432777 values remaining.
[19:01:17] Attack on 432777 Z values at index 7
100.0 % (432777 / 432777)
[19:29:52] Could not find the keys.
Thank you for the detailed experiment. What you experienced is not surprising. For the attack to be successful, the compressed plaintext must match the compressed data which was encrypted. Data compressed with the deflate algorithm typically starts with the representation of a Huffman tree which depends on the entire file (or big blocks of data).
The python script below illustrates this. Each prefix of a string is compressed and the first 12 bytes are printed.
import zlib

def deflate(data):
    return zlib.compress(data)[2:-4]  # discard zlib header and Adler-32 checksum

data = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.".encode()

for prefix in map(lambda i: data[:i+1], range(len(data))):
    print(f"prefix {len(prefix):<3d} ->", *(f"{b:02x}" for b in deflate(prefix)[:12]))
Compressing all the data in this example gives the following bytes.
35 90 c1 71 43 31 08 44 5b d9 02 3c
This is what we need to know for a successful attack if the text from the script was compressed and encrypted.
Compressing the first sentence:
25 cc d1 09 03 31 0c 04 d1 56 b6 80
Compressing the first two sentences:
25 8f 51 8e 03 31 08 43 af e2 03 54
You can see that compressing the first few words produces different compressed data than compressing the entire data. This is why a known plaintext attack is difficult unless an entire file is known when deflate compression is used in the encrypted archive.
The tutorial illustrates a simple case where no compression was used (for spiral.svg) so it is simple to guess the beginning of the plaintext.
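To illustrate the difference between the two cases, here is a small sketch that builds unencrypted archives in memory and checks whether the first 20 bytes of the file appear verbatim in the archive bytes. The sample data and the name archive_bytes are made up for this example; the archives are not encrypted, so this only shows what the bytes look like just before encryption.

```python
import io
import zipfile

# Made-up sample content that starts like a typical svg file.
SAMPLE = b'<?xml version="1.0" ?>\n' + b"<svg><path d='M 0 0 L 1 1'/></svg>\n" * 20


def archive_bytes(data: bytes, method: int) -> bytes:
    """Build an unencrypted in-memory zip with one entry and return its raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=method) as zf:
        zf.writestr("sample.txt", data)
    return buffer.getvalue()


if __name__ == "__main__":
    known = SAMPLE[:20]  # the guessed beginning of the file
    print("stored :", known in archive_bytes(SAMPLE, zipfile.ZIP_STORED))
    print("deflate:", known in archive_bytes(SAMPLE, zipfile.ZIP_DEFLATED))
```

With the stored method the guessed text is exactly what gets encrypted, which is why guessing the beginning of spiral.svg is enough in the tutorial.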
Did I answer your questions? Do you need more help?
I am closing this as inactive. Feel free to reopen if you need more help.
Hi, and thanks for making bkcrack.
I have trouble understanding where 0xA9 and the offset come from in your example tutorial.
I mean here:
Free additional byte from CRC
In this example, we guessed the first 20 bytes of spiral.svg.
In addition, as explained in the ZIP file format specification, a 12-byte encryption header is prepended to the data in the archive. The last byte of the encryption header is the most significant byte of the file's CRC.
We can get the CRC with unzip.
So we know the byte just before the plaintext (i.e. at offset -1) is 0xA9.
For example, I have one file in a zip, 1.zip --> 1.txt. Its CRC32 is 56C6A424 and I have 13 bytes of known plaintext.
What is this \xa9 in echo -n -e '\xa9<?xml version="1.0" ' > plain.txt?
And how do you determine the offset and this \xa9?
Thanks a lot.