openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/

zip2john can't handle properly some zip files #2418

Closed claudioandre-br closed 7 years ago

claudioandre-br commented 7 years ago

I failed to find any reported bug.

Steps to reproduce

IMPORTANT

magnumripper commented 7 years ago
dependencies-gtk2.zip:$zip2$*0*0*0*00000000*0000*93*00000001990700010041450308000b00000000000000d26d3872a57f0000867f7a2d46634fd883c3f0ee3d31437746aec1bddcb42546031303252538cc71a67e54424012eff4b045b116dced466071416f7bd6314c289d0ca6ec568400c63cede4cf5c963b7217f1ee9c9873c6ec5f75f1033dfa6f8548c7caafad0550715cb74899b62a423827d04cd0a4723a4ba26859fb98*a438acbafabe6e8e8a8e*$/zip2$:::::dependencies-gtk2.zip

It has "compression method 99" according to zipinfo. Apparently that's LZFSE.

claudioandre-br commented 7 years ago

Are you sure?

I created this file using a Linux GUI tool. Can't see it using the Apple 2016 LZFSE:

magnumripper commented 7 years ago

No, that was just from the last comment on http://juljas.net/lpt/post/zip-compression-method-99

But 99 it is.

kholia commented 7 years ago

Quick notes,

WinZip uses compression method 99 when AES encryption is in use. 7-Zip is able to successfully process this file, and 7-Zip is hitting the code path corresponding to WinZip AES encryption internally.

Is something going wrong with the ZIP parsing and hash generation parts of zip2john.c?

@claudioandre How did you generate this ZIP file? Which software did you use?

http://www.winzip.com/aes_info.htm is relevant here.
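
For reference, here is a minimal sketch of decoding the AES extra field (header ID 0x9901) described on that page. This is not the zip2john.c code; `struct aes_extra`, `parse_aes_extra` and `aes_salt_len` are hypothetical names made up for illustration, and the field layout and salt sizes are taken from my reading of the WinZip AES description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the 11-byte WinZip AES extra field. */
struct aes_extra {
    uint16_t header_id;      /* expected to be 0x9901 */
    uint16_t data_size;      /* expected to be 7 */
    uint16_t vendor_version; /* 1 = AE-1, 2 = AE-2 */
    char     vendor_id[2];   /* expected to be "AE" */
    uint8_t  strength;       /* 1 = 128-bit, 2 = 192-bit, 3 = 256-bit */
    uint16_t actual_method;  /* real compression method, e.g. 8 = Deflate */
};

/* Read a 16-bit little-endian value from a byte buffer. */
uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode an AES extra field; returns 0 on success, -1 if it does not look like one. */
int parse_aes_extra(const unsigned char *p, size_t len, struct aes_extra *out)
{
    assert(p != NULL && out != NULL);
    if (len < 11)
        return -1;
    out->header_id = rd16le(p);
    out->data_size = rd16le(p + 2);
    out->vendor_version = rd16le(p + 4);
    out->vendor_id[0] = (char)p[6];
    out->vendor_id[1] = (char)p[7];
    out->strength = p[8];
    out->actual_method = rd16le(p + 9);
    if (out->header_id != 0x9901 || out->data_size != 7)
        return -1;
    if (out->vendor_id[0] != 'A' || out->vendor_id[1] != 'E')
        return -1;
    return 0;
}

/* Salt length in bytes for a given strength code, or 0 if the code is unknown. */
size_t aes_salt_len(uint8_t strength)
{
    switch (strength) {
    case 1: return 8;
    case 2: return 12;
    case 3: return 16;
    default: return 0;
    }
}
```

Feeding it the bytes `01 99 07 00 01 00 41 45 03 08 00` found in this file's local header gives vendor version 1, strength 3 (256-bit, so a 16-byte salt) and actual compression method 8.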

kholia commented 7 years ago

In the attached zip file,

00000000: 504b 0304 2d00 0300 6300 d67e 2b4a ca4f  PK..-...c..~+J.O
00000010: 1455 b700 0000 de01 0000 1500 1f00 6465  .U............de
00000020: 7065 6e64 656e 6369 6573 2d67 746b 322e  pendencies-gtk2.
00000030: 7478 7401 0010 0000 0000 0000 0000 0000  txt.............
00000040: 0000 0000 0000 0001 9907 0001 0041 4503  .............AE.
00000050: 0800 0b00 0000 0000 0000 d26d 3872 a57f  ...........m8r..

There are 20 extra bytes between the end of the "txt" string and the start of the AES extra field header ID (0x9901). I wonder what these bytes are for, and how to detect them programmatically.
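
The arithmetic behind the "20 bytes" can be checked with a small sketch that decodes the fixed 30-byte local file header from the dump above. The names (`struct lfh_info`, `parse_lfh`, `bytes_before_aes`) are hypothetical, and `bytes_before_aes` assumes the AES record is the last 11 bytes of the extra field, which holds for this file but is not guaranteed in general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LFH_FIXED_SIZE 30 /* fixed part of a ZIP local file header */
#define AES_EXTRA_LEN  11 /* 4-byte record header + 7 bytes of AES data */

/* Hypothetical summary of the local file header fields we care about. */
struct lfh_info {
    uint16_t method;    /* compression method (99 = WinZip AES) */
    uint16_t fname_len; /* file name length */
    uint16_t extra_len; /* extra field length */
    size_t   extra_off; /* offset of the first extra field byte */
};

/* Read a 16-bit little-endian value from a byte buffer. */
uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the fixed header; returns 0 on success, -1 on a short or bad header. */
int parse_lfh(const unsigned char *buf, size_t len, struct lfh_info *out)
{
    assert(buf != NULL && out != NULL);
    if (len < LFH_FIXED_SIZE)
        return -1;
    if (buf[0] != 'P' || buf[1] != 'K' || buf[2] != 0x03 || buf[3] != 0x04)
        return -1;
    out->method = rd16le(buf + 8);
    out->fname_len = rd16le(buf + 26);
    out->extra_len = rd16le(buf + 28);
    out->extra_off = (size_t)LFH_FIXED_SIZE + out->fname_len;
    return 0;
}

/* Bytes preceding the AES record, assuming it sits at the end of the extra field. */
int bytes_before_aes(const struct lfh_info *h)
{
    assert(h != NULL);
    if (h->extra_len < AES_EXTRA_LEN)
        return -1;
    return h->extra_len - AES_EXTRA_LEN;
}
```

For this file the header gives a file name length of 0x15 (21) and an extra field length of 0x1f (31), so the extra field starts at offset 0x33 and 31 - 11 = 20 bytes come before the AES record at 0x47.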

kholia commented 7 years ago

Here is a patch to fix the parsing problems,

$ git diff
diff --git a/src/zip2john.c b/src/zip2john.c
index 83be14d..7c4dea7 100644
--- a/src/zip2john.c
+++ b/src/zip2john.c
@@ -170,17 +170,32 @@ static void process_file(const char *fname)
                        filename[filename_length] = 0;

                        if (compression_method == 99) { /* AES encryption */
+#define AES_EXTRA_DATA_LENGTH 11
                                uint64_t real_cmpr_len;
-                               uint16_t efh_id = fget16LE(fp);
-                               uint16_t efh_datasize = fget16LE(fp);
-                               uint16_t efh_vendor_version = fget16LE(fp);
-                               uint16_t efh_vendor_id = fget16LE(fp);
-                               char efh_aes_strength = fgetc(fp);
-                               uint16_t actual_compression_method = fget16LE(fp);
+                               uint16_t efh_id;
+                               uint16_t efh_datasize;
+                               uint16_t efh_vendor_version;
+                               uint16_t efh_vendor_id;
+                               char efh_aes_strength;
+                               uint16_t actual_compression_method;
                                unsigned char salt[16], d;
                                char *bname;
                                int magic_enum = 0;  // reserved at 0 for now, we are not computing this (yet).

+                               if (extrafield_length > AES_EXTRA_DATA_LENGTH)
+                                       fseek(fp, extrafield_length - AES_EXTRA_DATA_LENGTH, SEEK_CUR);
+                               efh_id = fget16LE(fp);
+                               efh_datasize = fget16LE(fp);
+                               efh_vendor_version = fget16LE(fp);
+                               efh_vendor_id = fget16LE(fp);
+                               efh_aes_strength = fgetc(fp);
+                               actual_compression_method = fget16LE(fp);
+
+                               if (efh_id != 0x9901) {
+                                       fprintf(stderr, "Unable to parse %s which is using AES encryption!\n", fname);
+                                       goto cleanup;
+                               }
+
                                strnzcpy(path, fname, sizeof(path));
                                bname = basename(path);
                                cp = cur;

After this patch, zip2john generates a seemingly valid hash, but I am unable to crack this hash with JtR.

kholia commented 7 years ago

With the updated PR https://github.com/magnumripper/JohnTheRipper/issues/2418, I am able to generate the correct hash and crack it with JtR.

@claudioandre Which software did you use to create the ZIP file?

claudioandre-br commented 7 years ago

> @claudioandre Which software did you use to create the ZIP file?

B1 Free Archiver

kholia commented 7 years ago

The 20 bytes between the file name and the AES extra field in this file are a Zip64 extended information extra field (header ID 0x0001): a 4-byte record header followed by 16 bytes of data. We can safely skip over these bytes.

PR https://github.com/magnumripper/JohnTheRipper/pull/2856 can be further improved to parse and evaluate these extra field header IDs instead of blindly jumping over stuff.
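
As a rough idea of what that improvement could look like, here is a sketch that walks the extra field as a sequence of (header ID, data size, data) records and returns the offset of a wanted ID. It is not what the PR does; `find_extra_record` is a hypothetical helper that works on an in-memory buffer rather than on the `FILE *` that zip2john.c reads from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian value from a byte buffer. */
uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the extra field record by record and return the offset of the first
 * record whose header ID equals `want`, or -1 if it is absent or the extra
 * field is malformed (a record claims more data than is left).
 */
long find_extra_record(const unsigned char *extra, size_t extra_len, uint16_t want)
{
    size_t pos = 0;

    assert(extra != NULL);
    while (extra_len - pos >= 4) { /* room for a 4-byte record header */
        uint16_t id = rd16le(extra + pos);
        uint16_t size = rd16le(extra + pos + 2);

        if ((size_t)size > extra_len - pos - 4)
            return -1; /* record runs past the end of the extra field */
        if (id == want)
            return (long)pos;
        pos += 4 + (size_t)size; /* skip this record's header and data */
    }
    return -1;
}
```

On the 31-byte extra field of this file, looking for 0x9901 returns offset 20 (just past the Zip64 record), and the lookup no longer depends on the AES record being the last one.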

kholia commented 7 years ago

This should be fixed now with PR #2856.