patrickfav / bcrypt

A Java standalone implementation of the bcrypt password hash function. Based on the Blowfish cipher it is the default password hash algorithm for OpenBSD and other systems including some Linux distributions. Includes a CLI Tool.
https://favr.dev/opensource/bcrypt
Apache License 2.0
467 stars 50 forks source link

Hash password strategy for overlong password and NUL bytes handling #27

Closed Indigo744 closed 4 years ago

Indigo744 commented 4 years ago

Hello,

This is a followup on the overlong password strategy, but this time with the hash one.

I read this article about using bcrypt for password hashing.

It suggests to pass the password through SHA384 before using bcrypt in order to circumvent the bcrypt limitation.

Your API support this approach, albeit with a SHA512.

However, looking at the code, I couldn't find any mention of base64 encoding after the hash and prior bcrypt.

base64 is suggested in the article because:

Bcrypt truncates on NUL bytes.

And

There is a nontrivial chance that one of the raw bytes in the hash will be 0x00. The sooner this byte appears in the string, the cost of finding a collision becomes exponentially cheaper. For example, both 1]W and @1$ produce a SHA-256 hash output that begins with ab00.

And

A base64-encoded hash is guaranteed to not contain NUL bytes

Here are some other implementations seen elsewhere:

So my suggestion are:

Thanks.

patrickfav commented 4 years ago

So, first of all I have my reservations on the expertise of Paragon guy, I read couple of their articles while researching another project and had some issues with their suggestions - to my understanding they are, like me, not security experts.

That being said, what Paragon is suggesting is to basically create another hashing protocol, which is fine, but cannot be a default in this lib. The current framework already supports your request. A quick and dirty solution would be to implement LongPasswordStrategy with:

    @Override
    public byte[] innerDerive(byte[] rawPassword) {
        return Bytes.from(Bytes.wrap(rawPassword).hash("SHA-384").encodeBase64()).array();
    }

Here are my issues with the suggested protocol:

Bottom line: you are, with the current implementation, free to implement the suggested protocol - but it's only feature would be interoperability with other implementations.

Indigo744 commented 4 years ago

Ah, I'm happy to see that this lib is not affected by the NULL byte issue.

However, offering more hash option (with base64) would help interoperability with the other libs I mentionned, I think.

Thanks for your reply!

patrickfav commented 4 years ago

Sure, but baking propriety hash protocols in this lib doesn't really make sense, especially if the developer can do it him/herself. If you like, I can assist you replicating other bcrypt dialects.

To reiterate the base64-ing: This doesn't really make sense from a technical or security standpoint; encoding is for either making byte-safe representation or transfer possible - this sounds like a very dirty hack for poor bcrypt implementations (no offence to the respective authors), so I dont really feel supporting this first-party.

EDIT: the main difference in view is that most bcrypt implementations use char as input, while the inner engine of this lib uses bytes, thats why Base64 just doesnt make much sense.

patrickfav commented 4 years ago

But stepping back a second: overlong passwords a really super special case in password hashing. I would say having a 70+ char password is seldom (if ever) used - I view myself as security aware, but don't really see the use case of such a long user-password. Here is an example of statistical analysis of passwords, where length 23+ is hardly ever used: https://www.codelord.net/2011/06/18/statistics-of-62k-passwords/

Not that it isn't an issue with the protocol, but it'll mostly affect your tester or QA ppl.

Indigo744 commented 4 years ago

Sure, but baking propriety hash protocols in this lib doesn't really make sense, especially if the developer can do it him/herself.

Alright, then I don't see the point of the whole password strategy thing? Just remove it altogether and let the dev handle its special flavor. At least it will be consistent.

Again, I'm not saying peppering (hash, base64, etc) is good or bad, or anything like that. This thread started as an interrogation on how this lib handled NULL byte from hash. Then the question was raised about interoperability and consistence.

I don't need any of these things, actually (the project I need this lib for uses bcrypt only, and only need the verify part). I am just offering some insight, hoping to improve your lib for future users.

From my point of view, a good lib is a consistent lib that works well with others.

Offering SHA512 flavor only while ignoring how others are doing it (SHA384, base64, etc...) is not consistent in my book.

But stepping back a second: overlong passwords a really super special case in password hashing

Of course, but it's not the topic and hand-waving it will not make it go away. If it can happen once, it will happen, hence this is a case that should be tackled properly.

Every person with a password manager can use an overly long password. I, for one, have some.

As a dev, I would be sad if a user registered through my PHP website couldn't log in my Android app because of bcrypt interoperability issue.

patrickfav commented 4 years ago

Maybe I hit the wrong tone, but my criticism was not against you, but the general issue with long passwords and homebrew dialects.

Alright, then I don't see the point of the whole password strategy thing?

The point was that I am aware that there are a bunch of dialects to handle this issues, so I gave an entrypoint to the developer to customise the intended behaviour. I didn't want to provide an exhaustive list of popular dialects. The standard implementations are merely examples, enough for anyone to overcome this issue, if a different approach is needed, it is possible to implement it.

I don't need any of these things, actually

Me neither, but you brought the point up, thats why we are discussing :) (I actually enjoy discussions about these topics)

From my point of view, a good lib is a consistent lib that works well with others.

Consistent would be to just truncate and not support any long passwords at all, since according to the original paper, no handling was defined for such a case. From that point of view hardly any of the implementations are "consistent". I'm all for providing the tools to make them interoperable, but to call it consistent is misleading I think.

Of course, but it's not the topic and hand-waving it will not make it go away.

That's why I also wrote "Not that it isn't an issue with the protocol,..."; I was merely suggesting that beeing a special case it should should get the matching attention share.


Bottom Line: I think we are drifting a bit. Im with you, that this lib should provide the tools to support interoperability with most common bcrypt implementations. However, I wouldn't want to implement any "recommended" bcrypt dialect to handle overlong passwords (as suggested in the first post) as I think it just worsen the problem with bcrypt dialect fragmentation.


EDIT: If you are interested here is my take on a modernised password hash & key derivation function based on bcrypt (unfinished) - it's not intended for productive use, more like as a research topic: https://github.com/patrickfav/bkdf

Indigo744 commented 4 years ago

Maybe I hit the wrong tone, but my criticism was not against you, but the general issue with long passwords and homebrew dialects.

I didn't take it personally, don't worry 😉 but English not being my first language, sometimes I can come across too blunt (or interpret a reply too harsh). My bad. I should use more emoticons maybe ^^

The current bcrypt landscape and its myriad of different implementation is really infuriating. The more I read into it, the more I'm baffled.

Anyway, thanks for listening 😅