besser82 / libxcrypt

Extended crypt library for descrypt, md5crypt, bcrypt, and others
GNU Lesser General Public License v2.1
195 stars 53 forks source link

crypt_r yields a segmentation fault #85

Closed e2002e closed 5 years ago

e2002e commented 5 years ago

hi, I build the lib and test it, works good, but I cannot use crypt_r that yields a segfault (unaligned strcpy). Is it working ?

zackw commented 5 years ago

That's not supposed to happen. Could you please provide us with a test case -- a short program that we can compile and run for ourselves to observe the same crash that you are getting? Also we need to know what CPU and operating system you are using.

e2002e commented 5 years ago

okay, my code looks like this: for(int i=0; i<hashcount; i++) { struct crypt_data datas; datas.initialized = 0; if(strcmp(crypt_r(tmp1, salt[i], &datas), hash[i]) == 0) } where salt is a list of salt declared extern, same for hash, tmp1 is a simple pointer to a buffer. compiled on x86_64 Debian 4.9.0-6 I installed libxcrypt via apt and the issue is gone. If you really need a runnable code please try zhou on my repositories, that's the project i'm working on, and the code as it is right now couldn't run with your library. it's very light.

zackw commented 5 years ago

I'm confused. Are you saying that you are seeing a crash when you DO use libxcrypt, or are you saying that you are seeing a crash when you DON'T use libxcrypt ?

Also it's bizarre that you would be getting an alignment-related crash on x86-64, which makes me think there's something unusual about your computer or configuration and the crash wouldn't happen for me. This is why I really need you to write a complete self-contained test program please. If I make one up or if I tinker with your "zhou" program, I'll probably get it wrong.

You should be aware that libxcrypt is not designed to be used for cracking applications; it takes extra security precautions that aren't necessary in that context, and its implementations of older hashing methods are not the fastest. You may have better luck modifying John the Ripper to do what you want.

e2002e commented 5 years ago

It crashed with this version compiled from source, with the libxcrypt provided by apt it runs. as for a runnable code I've simply compiled this:

include

include

int main() { struct crypt_data datas; datas.initialized = 0; crypt_r("wah", "gwaan", &datas); } and it crashes. As for the speed you might be surprised if you try it and compare it to john. It's actually about 4 time faster on an i5 and (confusing) the same speed on an amd 8 cores. Yet jumbo-john is just ~ 2 times slower on my intel, but totally beats my software on the amd. That's surprising to have so much difference. I am going to go crazy if I have to look at the uncommented source of john again.

solardiz commented 5 years ago

@e2002e Really weird stuff - the segfault, "unaligned strcpy" (strcpy isn't meant to require alignment, and x86 normally doesn't require alignment except with SIMD), the speeds ratios, and you getting any speed measurements at all for a program that you say crashes. This is very confusing. You'd need to provide some very specific info for anyone to look into this - e.g., a gdb backtrace and disasm around the crash, specific speeds for specific tests, and how you're getting those for your program despite of the crash (perhaps for a different program, then). Maybe that other program is so fast because it doesn't actually compute any hashes (the call returns with failure)?

@zackw BTW, formally don't we require struct crypt_data to be fully zeroized before use, with its individual fields opaque to users of that API?

solardiz commented 5 years ago

@e2002e I just took a look at your "zhou" project. Looks like you're just learning how to write that code, and perhaps learning is your goal? That's fine.

FYI, John the Ripper is also able to use crypt_r if you want it to, that's called the "generic crypt(3) format". See the file c3_fmt.c. It should be fairly easy to understand.

As to your use of crypt_r in that code, you could improve speeds by keeping a struct crypt_data per thread without initializing that structure every time you call crypt_r. This shouldn't make a noticeable performance difference for e.g. sha512crypt, but it should for the old descrypt hashes where the initialization time is very significant.

zackw commented 5 years ago

@e2002e In order to investigate further, I need to know:

@solardiz The documentation says that you must zero the entire crypt_data (except the fields provided for input) when using crypt_rn, but crypt_r allows you to just clear the initialized field. This is for backward compatibility with glibc.

However, ever since I replaced UFC descrypt with BSD descrypt, I don't think we have any hashing methods that care about the previous contents of the private fields of struct crypt_data anymore.

e2002e commented 5 years ago

Hello, true @solardiz I had a measurement failure but that was just the fact that I did only use one core with john (no -fork). Here is a paste of my /proc/cpuinfo https://pastebin.com/TKa550DV and out put of: $ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18+deb9u1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) I used the bootstrap script and ./configure then make then make install

zackw commented 5 years ago

Copying the /proc/cpuinfo submitted by e2002e from pastebin to a github issue attachment.

solardiz commented 5 years ago

@e2002e You could also build JtR with OpenMP support (uncomment a line in the Makefile if you're using non-jumbo), although fork's cumulative speed is typically slightly higher than OpenMP's. Also, note that JtR jumbo has much faster sha*crypt and md5crypt that use SIMD.

zackw commented 5 years ago

OK, I can't reproduce the problem with the submitted code, current develop, GCC 8, GLIBC 2.28, and an Intel(R) Xeon(R) CPU E3-1245 v3, which means this has to be specific to the exact CPU, compiler, and/or C library you are using, @e2002e. I need you to do some assembly-level debugging for me. Run your test program under gdb and wait for it to crash. Then issue these commands and report their complete and unedited output:

(gdb) set pagination off
(gdb) backtrace
(gdb) disassemble /r 
(gdb) info registers

These commands may produce a whole lot of output; it might be a good idea to use the script utility to capture a verbatim log.

e2002e commented 5 years ago

Here we go, I couldn't open it myself but I bet you'll have no problem with that. The code compiled here is:

include

int main() { struct crypt_data datas; datas.initialized = 0; crypt_r("foob", "ar", &datas); }

issue85.zip

zackw commented 5 years ago

OK, strcpy does not appear to be involved at all. The crash is on a 32-bit integer load:

 mov    0x2009c(%rsi),%eax

The value of %rsi is 0x7fffffff6100, which is stack memory. %rsp is 0x7fffffff6080. 0x7fffffff6100 + 0x2009c = 0x80000001619c — well-aligned for a 32-bit load, but above the end of the stack. Also, this instruction comes from a function named _ufc_setup_salt_r. No such function exists in the current development sources of libxcrypt.

I think what has happened is, you somehow managed to compile your program against libxcrypt's <crypt.h> but link and run it with GNU libc's libcrypt.so. The definition of struct crypt_data in libxcrypt is not compatible with the definition in glibc's libcrypt, and that's why it's crashing.

I will think about whether there is anything I could put in our crypt.h that would prevent this from happening.

e2002e commented 5 years ago

curious, I have messed up with a lot a libraries concerning sha512crypt, must be keeping yours from installing correctly.