openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/

Add NTLMv1-opencl and MSCHAPv2-opencl formats (mostly same code path and kernel) #3160

Open magnumripper opened 6 years ago

magnumripper commented 6 years ago

Apparently these formats are still used in the wild. We should have GPU support for them. The "many salts" case has totally wild figures on CPU already - it can be even more crazy fast on GPU.

Like... if we do 10G NT hashes per second and load 500 salts of NTLMv1, we'd get, uh.... 5T c/s on GPU. Did I just have one beer too many?
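The arithmetic behind that estimate can be written down as a small sketch. It assumes the NT hash is computed once per candidate and reused for every loaded salt; `effective_cps` and `per_salt_cost_sec` are hypothetical names for illustration, not anything in John's source. With a per-salt cost of zero it reproduces the optimistic 10G x 500 = 5T figure; any nonzero per-salt cost pulls the number down.

```c
#include <assert.h>

/*
 * Rough model of candidate/salt combinations tested per second.
 *
 * nt_rate           - NT hashes computed per second (one per candidate)
 * salts             - number of loaded salts sharing each NT hash
 * per_salt_cost_sec - ASSUMED extra time spent per salt per candidate
 *                     (0.0 models the "free" many-salts case)
 */
double effective_cps(double nt_rate, unsigned salts, double per_salt_cost_sec)
{
    double time_per_candidate;

    assert(nt_rate > 0.0);
    assert(per_salt_cost_sec >= 0.0);

    /* One NT hash, then per-salt work for every loaded salt. */
    time_per_candidate = 1.0 / nt_rate + salts * per_salt_cost_sec;

    /* Each candidate is tested against all salts in that time. */
    return salts / time_per_candidate;
}
```

For example, `effective_cps(10e9, 500, 0.0)` gives roughly 5e12, while a per-salt cost equal to the NT hash time itself (`1e-10`) gives a rate just under 1e10, i.e. no many-salts gain at all.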

magnumripper commented 6 years ago

@solardiz please help me think. Is the above a correct assumption at all?

magnumripper commented 6 years ago

Perhaps this would need a GPU kernel that is aware of all salts at once? I believe there's nothing really in the way to prevent that.

solardiz commented 6 years ago

It's been some years and I don't recall all the details anymore. IIRC, hashcat did this on GPU, so probably yes, there's room for some great speedup. OTOH, it looks like the remaining processing, which we have in cmp_*(), is done selectively - so it could be tricky to parallelize it efficiently. Since it's DES, the same applies to using a bitslice implementation, which would be relevant both on CPU and on GPU.

While we're at it, aren't the bitwise-ORs into bitmap in ntlmv1_mschapv2_fmt_plug.c's crypt_all() unsafe in OpenMP builds? Do they possibly predate our addition of OpenMP support to this format? Luckily, I think this format is built without OpenMP support by default, but when it is built with OpenMP, it may currently have a (low) chance of producing false negatives - when two threads each try to set a different bit in the same bitmap array element at once. Or is this somehow not possible (each thread somehow has a whole number of bitmap array elements all to itself)? A straightforward fix would be adding #pragma omp atomic before those ORs.
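The suggested fix can be sketched in isolation. This is a minimal standalone illustration, not the code from ntlmv1_mschapv2_fmt_plug.c: `mark_bits`, `BITMAP_BITS`, and the way bit indices are derived from `values` are all hypothetical. Without the atomic, two threads doing a read-modify-write on the same 32-bit element can each lose the other's bit, which is the false-negative case described above.

```c
#include <assert.h>
#include <stdint.h>

#define BITMAP_BITS 4096u /* hypothetical bitmap size, in bits */

/*
 * Set one bit in a shared bitmap for each input value.
 * The loop is split across OpenMP threads when built with OpenMP;
 * the atomic makes each OR an indivisible update of its array element.
 */
void mark_bits(uint32_t *bitmap, const uint32_t *values, int count)
{
    int i;

    assert(bitmap != 0 && values != 0);

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < count; i++) {
        uint32_t bit = values[i] % BITMAP_BITS;

        /* Different threads may target different bits of the same word. */
#ifdef _OPENMP
#pragma omp atomic
#endif
        bitmap[bit >> 5] |= 1U << (bit & 31);
    }
}

/* Return nonzero if the given bit is set. */
int bit_is_set(const uint32_t *bitmap, uint32_t bit)
{
    bit %= BITMAP_BITS;
    return (bitmap[bit >> 5] >> (bit & 31)) & 1U;
}
```

The `#ifdef _OPENMP` guards keep the same source compiling cleanly with and without OpenMP. An alternative that avoids atomics altogether would be to give each thread its own whole range of bitmap elements, but that only works if the bit indices can be partitioned that way.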

magnumripper commented 6 years ago

Oh, right. Non-SIMD builds are OpenMP by default. I'm adding #pragmas right away, just in case.

magnumripper commented 6 years ago

hashcat's benchmark shows NT speeds. Not sure how it will cope with many salts.