kimci86 / bkcrack

Crack legacy zip encryption with Biham and Kocher's known plaintext attack.
zlib License

how to get 0xA9 from hex #26

Closed haj-hossein closed 3 years ago

haj-hossein commented 3 years ago

Hi, and thanks for making bkcrack.

I have trouble understanding where 0xA9 and the offset come from in your tutorial example.

I mean this part:


Free additional byte from CRC

In this example, we guessed the first 20 bytes of spiral.svg.

In addition, as explained in the ZIP file format specification, a 12-byte encryption header is prepended to the data in the archive. The last byte of the encryption header is the most significant byte of the file's CRC.

We can get the CRC with unzip.

$ unzip -Z -v secrets.zip spiral.svg | grep CRC
  32-bit CRC value (hex):                         a99f1d0d

So we know the byte just before the plaintext (i.e. at offset -1) is 0xA9.


For example, I have one file in a zip (1.zip --> 1.txt) whose CRC32 is 56C6A424, and I have 13 bytes of known plaintext.

What is this \xa9 in the command echo -n -e '\xa9<?xml version="1.0" ' > plain.txt ?

And how do you determine the offset and this \xa9?

Thanks a lot.

kimci86 commented 3 years ago

Hello. From the ZIP format specification (see PKWARE's APPNOTE.TXT, paragraph 6.1.6), we know that the most significant byte of a file's CRC is at the end of the encryption header. In your example, the file 1.txt has a CRC value of 56C6A424. So we know that the value 56 (in hexadecimal) is at offset -1 (just before the actual encrypted data). The value 56 (in hexadecimal) comes from your file metadata and the offset -1 comes from the ZIP format specification (and also from how bkcrack is implemented).
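The rule described above can be sketched in a few lines of Python: take the CRC as printed by your zip tool and keep its top 8 bits. The helper name crc_check_byte is made up for this illustration and is not part of bkcrack.

```python
def crc_check_byte(crc_hex: str) -> int:
    """Return the most significant byte of a 32-bit CRC given as a hex string."""
    crc = int(crc_hex, 16)
    if not 0 <= crc <= 0xFFFFFFFF:
        raise ValueError("a CRC-32 value must fit in 32 bits")
    # Shifting right by 24 bits leaves only the top byte of the 32-bit value.
    return crc >> 24


# CRC values quoted in this thread (tutorial file and the two test files).
for crc_hex in ("a99f1d0d", "56C6A424", "C948EE28"):
    print(crc_hex, "->", f"{crc_check_byte(crc_hex):02X}")
```

The printed byte is the value to use at offset -1.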

In the tutorial, the CRC value for the example SVG file is a99f1d0d, so it gives the known byte a9. The command echo -n -e '\xa9<?xml version="1.0" ' > plain.txt is used to create a file starting with the byte a9 (in hexadecimal) followed by the typical beginning of an SVG file. The flag -e tells echo to interpret the sequence \xa9 as a byte. The flag -n is used to prevent echo from adding a newline at the end.
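Where echo -e is not available (on Windows, for example), roughly the same file can be produced from Python. This is only a sketch: write_plaintext is a hypothetical helper, and the file is written to a temporary directory here just to keep the example self-contained.

```python
import tempfile
from pathlib import Path


def write_plaintext(path, check_byte: int, known_prefix: bytes) -> bytes:
    """Write one raw byte followed by the known plaintext prefix, with no trailing newline."""
    data = bytes([check_byte]) + known_prefix
    Path(path).write_bytes(data)  # binary write: no encoding or newline translation
    return data


# Temporary location chosen for the example; any path would do.
plain_path = Path(tempfile.mkdtemp()) / "plain.txt"
data = write_plaintext(plain_path, 0xA9, b'<?xml version="1.0" ')
print(plain_path, len(data), "bytes, first byte:", data[:1].hex())
```

Writing in binary mode matters: the file must contain the single byte a9, not a text encoding of some character.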

In your case, I suggest you pass the information about the byte 56 at offset -1 with the following arguments: -x -1 56. For example: bkcrack -C 1.zip -c 1.txt -p plaintext.txt -x -1 56. Note that using the known byte from the CRC is not mandatory as you already know 13 bytes (the minimum for now is 12), but it can make the attack faster. The attack is faster when more contiguous plaintext is known, so if the known 13 bytes start at offset 0, the additional byte at offset -1 will extend the contiguous known plaintext to 14 bytes.

Did I answer your questions? Let me know if you need more information.

haj-hossein commented 3 years ago

Thanks for your answer.

For example, my file is in Arabic, and I know the first 31 bytes of 1.txt, which is in a zip with legacy encryption.

My known plaintext is السلام عليكم ورحمة الله وبركاته and it is 31 bytes.

[screenshot: plain]

I use the Windows version of bkcrack. The CRC32 is C948EE28, so I use C9.

Command: bkcrack.exe -C 1.zip -c 1.txt -p plain.txt -x -1 C9

The plaintext file was made in Windows:

[screenshot: windows plaintext]

With this command: bkcrack.exe -C 1.zip -c 1.txt -p plain.txt -x -1 C9

my result is:

[screenshot: win res]

I also used this command in Linux: echo -n -e '\xC9السلام عليكم ورحمة الله وبركاته' > plain.txt and I got: ةالسلام عليكم ورحمة الله وبركاته

I changed the plaintext to this and checked again:

[screenshot: echo linux test]

I have the plaintext but can't get the keys for the file.

Please tell me why. I need help with this.

Thanks for your help.

kimci86 commented 3 years ago

I can see your file is compressed. To run a known plaintext attack, the plaintext must also be compressed with the same algorithm. It will not be possible to get the correct compressed plaintext, because compression depends on the entire file (big blocks, actually) and you only know a small prefix. Sorry, but I think that a plaintext attack is not possible in this case. You can try brute-force or dictionary attacks with other tools like John the Ripper or hashcat.
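One way to check whether an entry is compressed before attempting the attack is to read the archive's metadata, which stays readable in a legacy-encrypted ZIP archive. The sketch below uses Python's standard zipfile module. describe_entries is a hypothetical helper, and the demo archive is built in memory (unencrypted) purely for illustration.

```python
import io
import zipfile

METHOD_NAMES = {zipfile.ZIP_STORED: "stored", zipfile.ZIP_DEFLATED: "deflated"}


def describe_entries(archive):
    """Return (name, compression method, encrypted flag) for every entry of a ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        return [
            (
                info.filename,
                METHOD_NAMES.get(info.compress_type, f"method {info.compress_type}"),
                bool(info.flag_bits & 0x1),  # bit 0 of the general purpose flags marks encryption
            )
            for info in zf.infolist()
        ]


# Demo archive built in memory: one stored entry and one deflated entry.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("stored.txt", b"hello " * 50, compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", b"hello " * 50, compress_type=zipfile.ZIP_DEFLATED)
buffer.seek(0)

entries = describe_entries(buffer)
for entry in entries:
    print(entry)
```

For a "stored" entry the encrypted bytes correspond directly to the file content, so a guessed prefix is usable as is. For a "deflated" entry the known plaintext has to be the compressed stream.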

haj-hossein commented 3 years ago

Thank you for your answer. I made this file myself just as a test with Arabic text. Although I know the content of the file, I can't get any result with the Arabic text, but with an English test I do get a result. The file is in legacy zip format and I know all of its content (the plaintext is the first line of the file).

The files are here: 1.zip plain.txt

kimci86 commented 3 years ago

To get correct plaintext, you should create a zip file without password containing the entire file 1.txt (not just the first line). This way, the created zip file will contain compressed bytes corresponding to those in the encrypted archive just before encryption. Using only the first line as plaintext is not enough because it has to be compressed exactly like the encrypted file.
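As a rough sketch of that step, the snippet below zips an entire file without a password using Python's zipfile module. Two assumptions to keep in mind: zip_whole_file is a hypothetical helper, and the sample 1.txt content is invented for the demo. Also, different zip programs and compression levels can produce different deflate streams, so in practice the unencrypted archive should be made with the same program and settings as the encrypted one. Python's output is not guaranteed to match.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_whole_file(source, archive):
    """Create an unencrypted, deflate-compressed archive containing the entire source file."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(source, arcname=Path(source).name)


# Invented sample content standing in for the real 1.txt.
workdir = Path(tempfile.mkdtemp())
source = workdir / "1.txt"
source.write_text("first line\nsecond line\nthird line\n" * 20, encoding="utf-8")

archive = workdir / "plain.zip"
zip_whole_file(source, archive)

with zipfile.ZipFile(archive) as zf:
    info = zf.getinfo("1.txt")
    stored_content = zf.read("1.txt")
print(info.filename, info.file_size, "->", info.compress_size, "bytes")
```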

haj-hossein commented 3 years ago

[screenshot: Untitled]

I tried with compressed plaintext:

bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt -x -1 8F

[screenshot: 2223]

With Arabic or other languages (not English) I can't get the keys.

kimci86 commented 3 years ago

Could you try again by compressing the entire file 1.txt (not just the first line) into plain.zip? Compressing only the first line will not generate the same compressed bytes.

haj-hossein commented 3 years ago

OK, I did everything again as a test, step by step.

This is my original file, and I set a password using the legacy zip format: [screenshot 1]

[screenshot 2]

This is plain.txt:

[screenshot 3]

This is the zip of the plaintext:

[screenshot 4]

Now I have these files (1.zip, plain.txt, plain.zip). The plaintext is 31 bytes: [screenshot 5]

Content of plain.zip: [screenshot 6]

The CRC32 is C948EE28, so I use C9.

Start the attack with the additional byte from the CRC:

bkcrack-1.0.0-win64>bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt -x -1 C9

Result:

[screenshot 7]

Start the attack without the additional byte:

bkcrack-1.0.0-win64>bkcrack.exe -C 1.zip -c 1.txt -P plain.zip -p plain.txt

Result:

[screenshot 10]

I tried with more bytes of plaintext (plain2.txt, plain2.zip):

[screenshot 8]

Result:

[screenshot 9]

I tried the attack with the entire content of the original file as plain3.txt, plain3.zip (all the text of 1.txt in 1.zip):

[screenshot 11]

I also tried with your example secrets.zip ----> spiral.svg using this command:

The plaintext is spiral.svg with 12 bytes > <?xml version="1.0" ?> and the plaintext is compressed to zip -> plain.zip

bkcrack.exe -C secrets.zip -c spiral.svg -P spiral.zip -p spiral.svg
bkcrack 1.0.0 - 2020-11-11
Generated 4194304 Z values.
[19:01:16] Z reduction using 16 bytes of known plaintext
100.0 % (16 / 16)
432777 values remaining.
[19:01:17] Attack on 432777 Z values at index 7
100.0 % (432777 / 432777)
[19:29:52] Could not find the keys.

kimci86 commented 3 years ago

Thank you for the detailed experiment. What you experienced is not surprising. For the attack to be successful, the compressed plaintext must match the compressed data which was encrypted. Data compressed with the deflate algorithm typically starts with the representation of a Huffman tree which depends on the entire file (or big blocks of data).

The Python script below illustrates this. Each prefix of a string is compressed and the first 12 bytes of the result are printed.

import zlib

def deflate(data):
    return zlib.compress(data)[2:-4] # discard zlib header and Adler-32 checksum

data = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.".encode()

# Compress every prefix of the data and print the first 12 bytes of each deflate stream.
for length in range(1, len(data) + 1):
    prefix = data[:length]
    print(f"prefix {len(prefix):<3d} ->", *(f"{b:02x}" for b in deflate(prefix)[:12]))

Compressing all the data in this example gives the following bytes:

35 90 c1 71 43 31 08 44 5b d9 02 3c

This is what we need to know for a successful attack if the text from the script was compressed and encrypted.

Compressing the first sentence: 25 cc d1 09 03 31 0c 04 d1 56 b6 80

Compressing the first two sentences: 25 8f 51 8e 03 31 08 43 af e2 03 54

You can see that compressing the first few words produces different compressed data than compressing the entire data. This is why a known plaintext attack is difficult unless an entire file is known when deflate compression is used in the encrypted archive.
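To quantify this on your own data, a small sketch can count how many leading bytes of a compressed prefix agree with the compressed full text. The names raw_deflate and common_prefix_len and the sample sentence are assumptions made for this illustration. raw_deflate uses zlib's raw mode (negative wbits), which yields a bare deflate stream without the zlib header and checksum.

```python
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress to a raw deflate stream (no zlib header, no Adler-32 trailer)."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that are identical in a and b."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# Invented sample text; replace it with the content you actually know.
text = ("the quick brown fox jumps over the lazy dog. " * 8).encode()
full = raw_deflate(text)

for size in (16, 64, 256, len(text)):
    part = raw_deflate(text[:size])
    print(f"prefix of {size:3d} bytes: {common_prefix_len(full, part)} "
          f"leading compressed bytes match (out of {len(part)})")
```

Only the leading bytes that match are usable as known plaintext against a deflated entry, and bkcrack needs at least 12 known bytes.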

The tutorial illustrates a simple case where no compression was used (for spiral.svg) so it is simple to guess the beginning of the plaintext.

Did I answer your questions? Do you need more help?

kimci86 commented 3 years ago

I am closing this as inactive. Feel free to reopen if you need more help.