paragonie / ciphersweet

Fast, searchable field-level encryption for PHP projects
https://ciphersweet.paragonie.com

Do derived keys actually need to be the result of an HKDF algorithm? #106

Open etkaar opened 2 months ago

etkaar commented 2 months ago

This article (I don't know whether its assumptions are correct) brought me back to a question I already had: do the derived keys actually have to be the result of an HKDF algorithm, or could they be the result of an HMAC? CipherSweet derives its keys using Util::HKDF, which uses PHP's \hash_hkdf():

public static function HKDF(
    #[\SensitiveParameter]
    SymmetricKey $key,
    #[\SensitiveParameter]
    ?string $salt = '',
    #[\SensitiveParameter]
    string $info = '',
    int $length = 32,
    string $hash = 'sha384'
): string {
    return \hash_hkdf($hash, $key->getRawKey(), $length, $info, (string) $salt);
}
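For context on the HKDF-versus-HMAC question: HKDF (RFC 5869) is itself two HMAC steps, an extract step keyed by the salt and an expand step keyed by the extracted value. Below is a minimal sketch of that relationship, assuming an output of at most one hash block. The function name hkdf_via_hmac and the example inputs are hypothetical and not part of CipherSweet.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not CipherSweet code): HKDF written out as two HMAC
// calls. Only handles outputs up to one hash block, to keep the sketch short.
function hkdf_via_hmac(
    string $ikm,
    string $info,
    string $salt = '',
    int $length = 32,
    string $hash = 'sha384'
): string {
    $hashLen = strlen(hash($hash, '', true));
    if ($length < 1 || $length > $hashLen) {
        throw new InvalidArgumentException('sketch handles a single output block only');
    }
    // RFC 5869: an absent salt is treated as HashLen zero bytes.
    if ($salt === '') {
        $salt = str_repeat("\x00", $hashLen);
    }
    // Extract: PRK = HMAC(key = salt, data = IKM)
    $prk = hash_hmac($hash, $ikm, $salt, true);
    // Expand, first block only: T(1) = HMAC(key = PRK, data = info || 0x01)
    $t1 = hash_hmac($hash, $info . "\x01", $prk, true);
    return substr($t1, 0, $length);
}

// Example inputs are arbitrary placeholders.
$ikm = str_repeat("\x0b", 32);
$viaBuiltin = hash_hkdf('sha384', $ikm, 32, 'example-info', '');
$viaHmac = hkdf_via_hmac($ikm, 'example-info');
var_dump(hash_equals($viaBuiltin, $viaHmac));
```

So for a 32-byte output, the call shown above amounts to one HMAC keyed by the salt followed by one HMAC keyed by the result, with $info as the message of the second call.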

The $info argument is, unsurprisingly, used for domain separation. What is very interesting, though, is that the article above states the following:

Unfortunately, anyone who ever does something like this [e.g. using the info argument for domain separation] just violated one of the core assumptions of the HKDF security definition and no longer gets to claim “KDF security” for their construction. Instead, your protocol merely gets to claim “PRF security”.

The article states that the salt is not the parameter that should vary between invocations:

Which means: You’re not supposed to use HKDF with a constant IKM, info label, etc. but vary the salt for multiple invocations. The salt must either be a fixed random value, or NULL.

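The conclusion is that the salt argument should stay fixed (either set to a specific value for domain separation or left NULL), and that the info argument should carry everything that varies, i.e. the random value one would otherwise pass as the salt together with the domain-separation data: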

It may seem weird, and defy intuition, but the correct way to introduce randomness into HKDF as most developers interact with the algorithm is to skip the salt parameter entirely (either fixing it to a specific value for domain-separation or leaving it NULL), and instead concatenate data into the info parameter.
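As I read it, the call shape the article recommends would look roughly like the sketch below: the salt is left empty and the per-derivation random value goes into the info string next to the domain label. The helper name derive_subkey, the labels, and the fixed 32-byte random length are my own assumptions for illustration. Neither the article nor CipherSweet prescribes them.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: name, labels, and sizes are assumptions.
function derive_subkey(string $ikm, string $domain, string $random): string
{
    // A fixed-length random suffix keeps "domain || random" unambiguous:
    // the last 32 bytes are always the random value.
    if (strlen($random) !== 32) {
        throw new InvalidArgumentException('expected exactly 32 random bytes');
    }
    // Salt stays empty (treated as absent); everything that varies
    // is concatenated into the info parameter instead.
    return hash_hkdf('sha384', $ikm, 32, $domain . $random, '');
}

$ikm = random_bytes(32);    // stands in for the long-term key
$random = random_bytes(32); // value one might otherwise pass as the salt

$encKey = derive_subkey($ikm, 'example-encryption', $random);
$macKey = derive_subkey($ikm, 'example-authentication', $random);
```

The same key, label, and random value always give the same subkey, while changing either the label or the random value gives an unrelated one.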

I personally can't evaluate this, but I did wonder whether HKDF is required for key derivation at all, or whether a plain HMAC would be more appropriate. So the question is whether the author is right and, if so, whether an HMAC would be the better choice here.