mitsuhiko / python-pbkdf2

Because pbkdf2 is awesome and bcrypt is overkill
Other
149 stars 29 forks source link

Segmentation Fault with high iterations #2

Open dfc opened 12 years ago

dfc commented 12 years ago

I am getting a segmentation fault with:

print pbkdf2.pbkdf2_hex('asdfcasdfc','asdfasd',99999,20)

The commented out test from the RFC also seg faults:

print pbkdf2.pbkdf2_hex('password', 'salt', 16777216, 20)

If you are interested I put a gdb backtrace on:

https://gist.github.com/2924570

mitsuhiko commented 12 years ago

That's because it's recursively chaining iterators and cpython has a crappy detection for when it runs out of stack space. Quickfix is adding a list() call around the starmap call, but it will make it waste more memory I would assume. I would claim that's a cpython bug that it segfaults and a bug in this module that it recurses so much.

Not sure if this should be fixed however.

warner commented 11 years ago

I just ran into this same problem. The threshold was about 88000 iterations, depending on the platform.

I'd argue that you should add that list() wrapper. It will actually use less memory, because the list in question is very small (32 bytes, the output of the hash function), and flattening the list after each iteration makes it forget about all those earlier iterators. Adding list() also makes it run about 5% faster.

It's fair to say that cpython should handle this better: it should throw a StackOverflowError rather than a segfault. But either error will prevent us from using PBKDF2 with more iterations :).

I'll throw together a one-liner pull request.

OriginalPenguin commented 11 years ago

This fix really should be applied and the library should be re-released.

If only to save other people like myself from having to spend an hour figuring out this issue exists. :)

Also as far as I can tell the best practice is to use sha256. Is that incorrect? If so, it should default to sha256, not sha1. Of course, if it's better to use sha1, then carry on!

warner commented 11 years ago

sha1 and sha256 are roughly equivalent for this purpose: I'd bet they have about the same ratio of (how fast can you run it) vs (how fast can the attacker's GPU farm can run it). sha256 will be slightly slower for any given iteration count, so you may want to tweak your settings to meet some fixed latency goal.

The biggest difference is that sha256 will give you 32 bytes of output for each unit of work, whereas sha1 will only give you 20. When you ask PBKDF2 for more than (hashsize) output bytes, it repeats the overall derivation enough times to give you the extra bytes. So asking PBKDF2-SHA1 for 21 (or 22 or anything up to 40) bytes will take twice as long as asking it for 1..32 bytes. PBKDF2-SHA256 for 33..64 bytes takes twice as long as 1..32 bytes.

OriginalPenguin commented 11 years ago

Hi Brian...

Thank you for the message.

I'm using:

The server takes 1.4 seconds to run pbkdf2.pbkdf2_hex

Does that seem ideal? Or would you recommend something else?

warner commented 11 years ago

Yeah, that sounds pretty good. If you're using sha256, you might as well use the whole 32-byte key (I like to standardize on 256-bits for all cryptographic values).

You might also be interested in http://keywrapping.appspot.com/ , which is a small calculator I put together for estimating the brute-forcing costs for PBKDF2 and scrypt (pronounced s-crypt). It's specific to my work project, but if you turn off the scrypt part, you can get some useful results for just the PBKDF2 phase.

snoack commented 7 years ago

I ran into the same issue, and before I noticed the discussion here, I filed a Python bug.

Anyway, 200,000 iterations is pretty much standard nowadays, and much less can no longer be considered secure. So IMO this module currently, while it attempts to serve a security need, is actually making security worse, by enforcing unsafe presets and causing crashes.

Then again, the question is how useful a PBKDF2 implementation in Python is anyway. With a reasonable amount of iterations, it takes several seconds even on modern hardware, which is much longer than you are willing to wait to unlock your password manager for example. On the other hand, an implementation with the loop in native code, can easily achieve reasonable performance and good security at the same time.