Closed kholia closed 7 years ago
Should pfx_fmt_plug.c be redone using the pkcs12_plug.c functions, and rip it off of the oSSL / algorithm anchor?
We already have a new pfx_ng_fmt_plug.c file which uses the new PKCS#12 code. We can simply drop the old pfx_fmt_plug.c file when the new format is feature complete (and fast).
To do SIMD, we will need to do this like we do the PBKDF2 stuff, where each format will need to handle loading proper arrays of input, then calling the PKCS#12 functions. Should be doable for sure.
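A minimal sketch of what "loading proper arrays of input" could look like, assuming a 4-lane SIMD build and 32-bit words. The names interleave_blocks, LANES and BLOCK_WORDS are made up for illustration, and the layout is a generic word-interleaved one, not necessarily the one the PBKDF2 code in the tree uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4        /* assumption: 4x SIMD (e.g. 128-bit vectors of 32-bit words) */
#define BLOCK_WORDS 16 /* one 64-byte hash block = 16 words of 32 bits */

/*
 * Interleave LANES independent 64-byte blocks, stored back to back in `in`,
 * so that word w of lane l ends up at out[w * LANES + l].
 * A single vector load at &out[w * LANES] then picks up word w of every
 * candidate at once.
 */
static void interleave_blocks(const uint32_t *in, uint32_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t w = 0; w < BLOCK_WORDS; w++)
        for (size_t l = 0; l < LANES; l++)
            out[w * LANES + l] = in[l * BLOCK_WORDS + w];
}
```

Each format would fill `in` with one prepared block per candidate password, call the helper, and hand `out` to the vectorized hash routine.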
NOTE:
int mbedtls_pkcs12_derivation( unsigned char *data, size_t datalen, const
unsigned char *pwd, size_t pwdlen, const unsigned char *salt,
size_t saltlen, int md_type, int id, int iterations )
{
unsigned int j;
unsigned char diversifier[128];
unsigned char salt_block[128], pwd_block[128], hash_block[128];
unsigned char hash_output[1024];
unsigned char *p;
unsigned char c;
size_t hlen, use_len, v, i;
SHA_CTX md_ctx;
// This version only allows max of 64 bytes of password or salt
if( datalen > 128 || pwdlen > 64 || saltlen > 64 )
return -1; // MBEDTLS_ERR_PKCS12_BAD_INPUT_DATA
hlen = 20; // for SHA1
if( hlen <= 32 )
v = 64;
else
v = 128;
Since we fail based on pwdlen, which is (plaintext_len << 1) + 2, should we be limiting the plaintext length of all formats using this code to 31 bytes?
Question is for @kholia
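The arithmetic behind the 31-byte figure, as a small sketch. The helper names pkcs12_pwdlen and max_plaintext_len are hypothetical; the formula and the 64-byte cap are taken from the question and the quoted check above.

```c
#include <stddef.h>

/*
 * Bytes handed to the KDF for an n-character plaintext:
 * two bytes per character plus a two-byte terminator.
 */
static size_t pkcs12_pwdlen(size_t plaintext_len)
{
    return (plaintext_len << 1) + 2;
}

/* Largest plaintext length whose encoded form still fits in max_pwdlen bytes. */
static size_t max_plaintext_len(size_t max_pwdlen)
{
    if (max_pwdlen < 2)
        return 0;
    return (max_pwdlen - 2) / 2;
}
```

With the quoted check `pwdlen > 64`, a 31-character plaintext encodes to exactly 64 bytes and passes, while 32 characters encode to 66 bytes and are rejected.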
$ ../run/john -test -form=bks
Will run 8 OpenMP threads
Benchmarking: BKS [PKCS12 PBE SHA-1 32/64]... (8xOMP) DONE
Raw: 12331 c/s real, 1892 c/s virtual
$ ../run/john -test -form=bks
Will run 8 OpenMP threads
Benchmarking: BKS [PKCS12 PBE 128/128 AVX 4x]... (8xOMP) DONE
Raw: 29545 c/s real, 7773 c/s virtual
Only 2.4x faster now (still a POC). Note, it is not 100% SIMD; I only SIMD the iteration block. BUT this is only 4x SIMD. We should see a 5x improvement for 8x SIMD chips (with the current POC code).
$ ../run/john -test -form=pfx-ng
Will run 8 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 32/64]... (8xOMP) DONE
Raw: 15860 c/s real, 2344 c/s virtual
$ ../run/john -test -form=pfx
Will run 8 OpenMP threads
Benchmarking: PFX, PKCS12 (.pfx, .p12) [32/64]... (8xOMP) DONE
Raw: 16659 c/s real, 5410 c/s virtual
$ ../run/john -test -form=pfx-ng
Will run 8 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 128/128 AVX 4x]... (8xOMP) DONE
Raw: 55986 c/s real, 7908 c/s virtual
That looks a bit better ;) Looks like the BKS has a lot of data that is done in the final hmac. We do not have SIMD code for hmac (but might be able to steal logic from pbkdf2-hmac-sha1.h file).
Ok, getting ready to check this in. Now the speeds are 'better'. I also have 1 line in pkcs12.h which we can set to #if 1 and build in non-SIMD mode, for easier testing.
$ ../run/john -test -form=bks
Will run 8 OpenMP threads
Benchmarking: BKS [PKCS12 PBE 32/64]... (8xOMP) DONE
Raw: 11663 c/s real, 1723 c/s virtual
$ ../run/john -test -form=pfx-ng
Will run 8 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 32/64]... (8xOMP) DONE
Raw: 13807 c/s real, 2116 c/s virtual
$ vi pkcs12.h
$ make -s
Make process completed.
$ ../run/john -test -form=bks
Will run 8 OpenMP threads
Benchmarking: BKS [PKCS12 PBE 128/128 AVX 4x]... (8xOMP) DONE
Raw: 49399 c/s real, 6805 c/s virtual
$ ../run/john -test -form=pfx-ng
Will run 8 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 128/128 AVX 4x]... (8xOMP) DONE
Raw: 57510 c/s real, 8465 c/s virtual
Both are about 4.1-4.2x faster on my AVX (SIMD-4x) so being > 4x makes me smile a bit ;)
Gonna just directly check in the SIMD changes. This is NOT opencl, someone else can play with that one ;)
NOTE, I set Plaintext length to 31 for both of these formats (until I hear otherwise).
@kholia please test on your AVX2, to make sure I did not bone anything. I have only tested on AVX (SIMD-4x).
Does PKCS12 also use other hashing algo? I am surprised that only sha1 is supported, and would think all (or many) SHA2 hashes should also be there.
Before SIMD,
$ ../run/john --format=pfx-ng --test
Will run 4 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 32/64]... (4xOMP) DONE
Raw: 11080 c/s real, 2797 c/s virtual
After SIMD changes,
$ ../run/john --format=pfx-ng --test
Will run 4 OpenMP threads
Benchmarking: pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 256/256 AVX2 8x]... (4xOMP) DONE
Raw: 62816 c/s real, 15743 c/s virtual
Speed-up is around 5.7x.
Does PKCS12 also use other hashing algo? I am surprised that only sha1 is supported, and would think all (or many) SHA2 hashes should also be there.
Yes, other hashes are supported too. See /run/pfxng2john.py for a list. Various SHA2 hashes are there.
I should have named mbedtls_pkcs12_derivation as mbedtls_pkcs12_derivation_sha1 in the first place.
Do you have the specific differences? I would like to take a shot at getting at least non-SIMD support for all hashes. I see:
sha1, 224, 256, 384, 512. Also 512_224/512_256, which I am not sure john handles. But I REALLY think we should target 256 and 512 out of the gate. Yes, there 'are' others, but I think sha1 (default), along with the 'normal' sha2 hashes, should be what we get working now.
Since we fail based on pwdlen, which is (plaintext_len << 1) + 2, should we be limiting the plaintext length of all formats using this code to 31 bytes?
I am not sure about this one. The "pwdlen > 64" check in mbedtls_pkcs12_derivation seems to be a limitation of this particular implementation.
https://github.com/doublereedkurt/pyjks/blob/master/jks/rfc7292.py#L21 has another implementation of PKCS#12 key derivation.
For now, setting PLAINTEXT_LENGTH to 32 is OK I think.
32 would not work, since the UTF16 null looks like it is required.
But now that I have figured out how to generate these things, I can build hashes with longer passwords, to see just what would crack. I see no reason why it should not 'work', but I really do not understand the algo 'yet'.
I can attempt getting sha256 support working. Give me a day or two.
32 was a typo in my last comment, I meant 31 :-)
Generating a PKCS#12 Private Key and Public Certificate
=======================================================
1. Generate an RSA private key
openssl genrsa -out openwall.key 1024
2. Generate a Certificate Signing Request
openssl req -new -key openwall.key -out openwall.csr
3. Generate a self-signed public certificate based on the request
openssl x509 -req -days 3650 -in openwall.csr -signkey openwall.key -out openwall.crt
4. Generate a PKCS#12 file
openssl pkcs12 -keypbe PBE-SHA1-3DES -certpbe PBE-SHA1-3DES -export -in openwall.crt -inkey openwall.key -out openwall.pfx -name "openwall"
$ openssl pkcs12 -keypbe PBE-SHA1-3DES -macalg sha256 -certpbe PBE-SHA1-3DES -export -in openwall.crt -inkey openwall.key -out test12345.pfx -name "test12345" # for SHA256
You can use this information to generate .pfx files for testing.
NOTE, it appears that a 30 char pw is the 'max' for the current code (sha1) for pfx-ng. I built a 30 char and a 31 char hash, and only the 30 char one can be found.
new issue added #2183 (original pfx format does not have this problem)
Sha256 SIMD added (along with cost stuff, since we now have algo costs). Only saw a 2.5x improvement for SHA256, but I think the SIMD improvements are a bit less for that hash. Also, the hmac being done in oSSL for sha256 'may' have a larger impact on overall speed.
sha256 also seems to have 30 byte max password length for pfx-ng
NOTE, pfx format also is only finding 30 byte passwords. I wonder if our *2john are right? Possibly, 30 character is MAX possible??? (My bad, that was ng)
@kholia I already have sha512 in there, BUT I have to find/fix a problem with SHA256 simd code (I think I know the issue). Once I get sha256 figured out, the 512 code should be trivial (the oSSL version / pfxng2john.py code already is working fine).
The actual length limit of 30 needs further investigation. This said, 30 should not be a problem in practice.
Ok, all 3 hashes (sha1, sha256, sha512), for SIMD (at least) only detect 30, and not 31:
Using default input encoding: UTF-8
Loaded 6 password hashes with 6 different salts (pfx-ng [PKCS12 PBE (.pfx, .p12) (SHA-1 to SHA-512) 128/128 XOP 4x])
Loaded hashes with cost 1 (mac-type) varying from 1 to 512
Will run 2 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
123456789012345678901234567890 (sha256-30.pfx)
123456789012345678901234567890 (sha512-30.pfx)
123456789012345678901234567890 (sha1-30.pfx)
3g 0:00:00:00 DONE (2016-07-27 10:54) 11.95g/s 59.76p/s 274.9c/s 274.9C/s a..12345678901234567890
Use the "--show" option to display all of the cracked passwords reliably
Session completed
I will check to make sure non-SIMD has same limitation, but I would bet it does (and will only post a follow up if I find otherwise)
It should be easy to increase that count, but we will probably have to dig into oSSL source, and see just how the data is used. Likely it will simply be within the pkcs12_fill_buffer function(s)
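For reference, a sketch of what a fill-buffer helper of this kind does in the PKCS#12 KDF: it builds a fixed-size block by repeating the salt or the password until the block is full. The name fill_repeating and its signature are assumptions for illustration, not the actual pkcs12_fill_buffer code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Fill dst (dst_len bytes) with back-to-back copies of src (src_len bytes).
 * The last copy is cut short if it does not fit.
 */
static void fill_repeating(unsigned char *dst, size_t dst_len,
                           const unsigned char *src, size_t src_len)
{
    if (src_len == 0)
        return; /* nothing to repeat; leave dst untouched */
    while (dst_len > 0) {
        size_t n = dst_len < src_len ? dst_len : src_len;
        memcpy(dst, src, n);
        dst += n;
        dst_len -= n;
    }
}
```

A helper of this shape silently truncates a source that is longer than the destination, so a fixed-size block is one place a length cap could come from. That is a guess, not a confirmed diagnosis.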
Ok, I am able to get 31 byte passwords to work. BUT to do that, I have to set PLAINTEXT_LENGTH to 32 ?!?!?! Yes, we do need to figure this out. It really should NOT be an issue to make the password length longer. The password handling seems to play very little role in overall speed. It simply preloads the hash contexts. The main loop just re-hashes the output of the prior step, thus the speed will be constant.
I have been able to increase PLAINTEXT_LENGTH (at least on sha1). I think SHA256 would be the same. I am surprised that > 31 for SHA512 was not working (it may have been already).
However, it appears there is some max length WITHIN the openssl process on key length. We should find that out, and make sure that we can at least handle up to that size. (it is less than 60 I think).
Ok, this DEBUG_KEYGEN build of oSSL shows (I think) the limit. The password is 1234567890123.....90 (50 bytes long). But look at the 'last' password length (only 100 bytes, versus 102 in the earlier calls). The last '0' character was truncated off, so only a 49 byte password was used, AND that 49 byte password cracks the hash.
NOTE, this may also simply be a bug in the oSSL version I am building against (1.0.2d)
...
Password (length 102):
003100320033003400350036003700380039003000310032003300340035003600370038003900300031003200330034003500360037003800390030003100320033003400350036003700380039003000310032003300340035003600370038003900300000
Salt (length 8):
873C6DEB148CFB1E
Output KEY (length 8)
A3A80D046B9E4EC5
KEYGEN DEBUG
ID 1, ITER 2048
Password (length 102):
003100320033003400350036003700380039003000310032003300340035003600370038003900300031003200330034003500360037003800390030003100320033003400350036003700380039003000310032003300340035003600370038003900300000
Salt (length 8):
141846579E42A530
Output KEY (length 24)
C072036CF5FCEC196518EF0FD0A5F4A97F928D5B78FD2B90
KEYGEN DEBUG
ID 2, ITER 2048
Password (length 102):
003100320033003400350036003700380039003000310032003300340035003600370038003900300031003200330034003500360037003800390030003100320033003400350036003700380039003000310032003300340035003600370038003900300000
Salt (length 8):
141846579E42A530
Output KEY (length 8)
8708CF9E72DF6EF5
KEYGEN DEBUG
ID 3, ITER 2048
Password (length 100):
00310032003300340035003600370038003900300031003200330034003500360037003800390030003100320033003400350036003700380039003000310032003300340035003600370038003900300031003200330034003500360037003800390000
Salt (length 8):
AB4BEC920EB254ED
Output KEY (length 20)
CCB8176E3C77DBC121963BC0CAB87FD0C46514D9
The final 'key' CCB8...14D9 is the key generated by mbedtls_pkcs12_derivation() if we use the first 49 bytes of that numeric password.
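The lengths in the dump follow from the password encoding visible in the hex: each character becomes two bytes (00 31, 00 32, ...) and a 00 00 terminator is appended. A minimal sketch of that encoding, assuming ASCII-only input; ascii_to_bmp is a hypothetical helper, not a function from the tree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Encode an ASCII password as big-endian UTF-16 plus a two-byte terminator,
 * matching the layout in the debug dump (00 31 00 32 ... 00 00).
 * Assumption: ASCII input only; real code has to handle full Unicode.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
static size_t ascii_to_bmp(const char *pw, unsigned char *out, size_t outsize)
{
    size_t n = strlen(pw);
    size_t need = 2 * n + 2;

    if (need > outsize)
        return 0;
    for (size_t i = 0; i < n; i++) {
        out[2 * i] = 0;                         /* high byte */
        out[2 * i + 1] = (unsigned char)pw[i];  /* low byte */
    }
    out[2 * n] = 0;
    out[2 * n + 1] = 0;
    return need;
}
```

A 50-character password encodes to 102 bytes and a 49-character one to 100 bytes, which matches the "length 102" and "length 100" lines above.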
sha224/sha384 added (non-simd). I am not sure the ROI on SIMD for those hashes is worthwhile. Likely they will not be seen ITW.
Also, that python lib only does sha1, 224, 256, 384 and 512. oSSL does Whirlpool, sha0, md5, md4, mdc2 and likely others, but those were the only macalgos I could get oSSL to do. There were others documented, but my cygwin oSSL would not do them (like RIPEMD160, blake2, etc). I think if we want 'full' PKCS#12 support, we will need to write our own pfxng2john.c conversion program in C, not using python. But again, is that worth the ROI?
No, I don't think that full PKCS#12 support is worth the effort. I think that the existing Python script can easily support all the possible mac algorithms.
The asn1crypto lib does macalgo of sha1, sha224-512 only. But that likely can be 'deemed' as full enough support.
PLAINTEXT_LENGTH bumped up to 48 for all formats. From my usage of openssl, this seems like the max it will do anyway. If you add a 49 byte pw, ossl trims it to 48. Also, if you add a longer password, ossl bails saying no password entered. (likely a bug in ossl, but I do not care). 48 byte password checking is long enough.
The SIMD code is done for now. All sha1 and sha2 are handled by the format. SIMD code is there for all hashes except the textbook ones of sha224 and sha384 (which no one ITW will use for real hashes).
36a7264
Jetico BestCrypt also makes use of this KDF. It would be nice to get an OpenCL port of this KDF.
What exactly do we need? Just saying "PKCS#12 KDF" is akin to saying "PBKDF2" without stating a PRF.
Oh, right! Starting with the OpenCL port of PKCS#12 KDF with SHA-1 as the hashing function would be good.
This is partially done in https://github.com/magnumripper/JohnTheRipper/pull/2583.
@magnumripper Great work in commit 22eb8e345ebd1b715dbc62a3c1387db186a234ef 👍
I think that the license of the src/opencl_pkcs12.h file needs to be the same as the pkcs12_plug.c one. I will send a pull request soon to fix this.
This issue can be closed now.
PKCS#12 KDF resides in the "pkcs12_plug.c" file.
It would be great to have a SIMD and OpenCL version of it.