narfbg / SimpleEncryption

Simple Encryption for PHP
22 stars 6 forks source link

Compute MAC Prior to base64_encoding #5

Closed ircmaxell closed 10 years ago

ircmaxell commented 10 years ago

Currently, you are computing a MAC of the base64_encodeed ciphertext and iv.

$data = base64_encode($iv.$data);
return hash_hmac('sha256', $data, $hmacKey, false).$data;

This can open up vulnerabilities because base64_decode will silently ignore characters outside of a-zA-Z0-9./=. Example:

var_dump(base64_decode(base64_encode("foo") . chr(0).chr(1).chr(128)));

will result in string(3) "foo".

Instead, MAC the ciphertext prior to encoding it (and verify the mac only after decoding it).

narfbg commented 10 years ago

Isn't the whole point in the MAC verifying that the Base64 string is not altered?

ircmaxell commented 10 years ago

Well, imagine the following situation:

An attacker changes the base64 encrypted value. But they want the MAC to still verify. So they can use length extension to add non-alpha-numeric (not valid base64 characters) to the end of the base64 encoded value. This could allow the MAC to validate even though the encrypted ciphertext has changed.

Instead, if you authenticated the ciphertext directly, the attacker would have to directly manipulate the encrypted value to get the MAC to validate. Validating the base64 value would open an attack where the attacker could modify data other than the ciphertext (thereby making it easier for them to forge the data).

narfbg commented 10 years ago

I'm almost still sleeping, but the HMAC is supposed to validate all that follows it (in this case, because it's prepended) and I'm not Base64-decoding prior to authentication. Adding non-alphanumeric characters to the end of the cipherText would surely trigger an authentication failure, so how would such an attack be successful?

ircmaxell commented 10 years ago

Well, the point is that the HMAC is validating everything inside it. The problem is that due to the ciphertext being base64 encoded, it gives the attacker characters that they can use to length-extend the MAC to get it to validate.

So in theory they can generate a new ciphertext which validates the MAC while being different (because they can vary the ciphertext independently by appending garbage to the base64_encoded text using a tool similar to HashPump

narfbg commented 10 years ago

Quoting the Wikipedia article that you've linked:

Note that since HMAC doesn't use the construction H(key ∥ message), HMAC hashes using susceptible algorithms are not prone to length extension attacks

A random article from a google search result on "hash length extension attack" specifically notes usage of HMAC as the solution to the problem (note that I'm using hash_hmac(), not a raw hash).

So, I'm more concerned about the possible drawbacks of not covering the Base64 string (or IV/salt, issue #4) with a HMAC. I'm not aware of a practical attack based on that, but it's not hard to imagine given that it would be under the attacker's control.

ircmaxell commented 10 years ago

That's fair. However that's still not an excuse to give an attacker free characters that count towards the MAC, but don't count towards the ciphertext.

So, I'm more concerned about the possible drawbacks of not covering the Base64 string

Which are?

What you want to be doing is EtM (Encrypt Then MAC). This is specified in ISO/IEC 19772:2009 as using a MAC directly on the ciphertext output from the encryption function. Not the ciphertext with IV, not the ciphertext encoded, the raw ciphertext.

narfbg commented 10 years ago

That's the problem, I don't know if there are possible drawbacks and so I'm trying to be safe by authenticating all of it.

If your suggestion is specified in a standard though, it must be safe... I'll look into it tomorrow.

Sorry about questioning you that much, especially since I asked for help. Just trying to understand it clearly. :-)

defuse commented 10 years ago

If the entire base64 string is encoded, I don't see either how an attacker could modify the base64 without detection (even characters ignored by base64_decode would get caught, because the HMAC which is checked before decoding doesn't catch that). It's the other way around: If you HMACed only the raw ciphertext+IV, then an attacker could add things to the base64 string.

But yes, you do want to authenticate the raw ciphertext+IV, not the encoded one. An attacker could add stuff to the base64 but it really doesn't matter, AFAIK there is no security impact, since it's (it will be) all stripped away first-thing in the decrypt function.

defuse commented 10 years ago

s/encoded/HMACed/