haskell-crypto / cryptonite

lowlevel set of cryptographic primitives for haskell

Supporting personalization for Blake2b hashes #333

Closed nuttycom closed 1 year ago

nuttycom commented 4 years ago

I'm trying to figure out how I can possibly pass personalization data in to blake2b hash initialization. At present, all of the blake2xb* C code memsets the personalization field to all zeros. I started working on a patch that makes it possible to pass in the 16-byte personalization string, but then I ran into the problem that there appears to be no way to implement HashAlgorithm for a type which will permit me to provide this data; hashAlgorithmInit does not take any argument that allows end-user specialization of the context.

Is it really the case that I have to construct my own type, for which the HashAlgorithm instance hardcodes the personalization information? That seems exceptionally limiting. For that matter, the type-level fixing of digest size leads to a ton of needless duplication in the blake2b implementation, and means that for one of my use cases (which requires a 50-byte digest) I need to implement an entirely separate type. What is the reason for HashAlgorithm being so limiting?

The signature of hashInitWith suggests an intention of being able to support algorithms that expose end-user-configurable hashes, but then the implementation is just const hashInit.

nuttycom commented 4 years ago

Equally puzzling to me, the HashDigestSize type doesn't appear to be used anywhere. So it locks down to statically knowable hash sizes, but doesn't use those static assurances for anything? What's the purpose of this associated type?

vincenthz commented 4 years ago

The problem with the personalisation bits is that they break the assumption that a given hash supported by HashAlgorithm is self-describing, i.e. that you can create it out of thin air at any time.

The functionality you're looking for, is closer to a Mac algorithm than a HashAlgorithm.

Note that the duplication of the blake2b_* types is handled at the code-generation level by the gen/ machinery:

    , GenHashModule "Blake2b"   "blake2.h"    "blake2b"   248  (HashMulti [] [(160, 128), (224, 128), (256, 128), (384, 128), (512,128)])

But going forward, all sizes of blake2 are supported by the Blake2b / Blake2s types; a Blake2b with a 50-byte digest is just:

Blake2b 400
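
For reference, a minimal sketch of hashing with a 50-byte Blake2b digest. It assumes a cryptonite version that exports the size-parameterised Blake2b type from Crypto.Hash.Algorithms, plus the bytestring and memory packages; the helper name blake2b400 and the input string are made up for illustration.

```haskell
{-# LANGUAGE DataKinds #-}

import Crypto.Hash (Digest, hashWith)
import Crypto.Hash.Algorithms (Blake2b (..))
import qualified Data.ByteArray as BA
import qualified Data.ByteString.Char8 as BC

-- 400 bits = 50 bytes; the bit length must be a multiple of 8 and at most 512.
-- blake2b400 is a hypothetical helper name, not part of cryptonite.
blake2b400 :: BC.ByteString -> Digest (Blake2b 400)
blake2b400 = hashWith Blake2b

main :: IO ()
main = do
  let digest = blake2b400 (BC.pack "hello")
  print digest              -- Show renders the digest as hex
  print (BA.length digest)  -- digest length in bytes
```

The digest size lives entirely in the type, so no separate type per size is needed.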

As for the personalisation, you need to create your own type class, e.g. HashInitPerso, that lets you create a hash context with the personalisation string you want, while defining instances only for the algorithms that allow such a thing (which are the exception, not the norm):

class HashAlgorithm h => HashInitPerso h where
    hashInitWith :: ... -> h

instance (IsDivisibleBy8 bitlen, KnownNat bitlen, IsAtLeast bitlen 8, IsAtMost bitlen 512) => HashInitPerso (Blake2b bitlen) where
    hashInitWith ...
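
To make the shape of that idea concrete, here is a minimal sketch that uses only base. Everything in it is a hypothetical stand-in: Blake2b is redeclared locally, PersoContext only records the personalisation instead of wrapping a real hash state, the method is named hashInitPerso to avoid clashing with the existing hashInitWith, and zero-padding short inputs to 16 bytes is an assumption of the sketch.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.Word (Word8)
import GHC.TypeLits (Nat)

-- Local stand-ins; these are NOT the cryptonite types.
data Blake2b (bitlen :: Nat) = Blake2b
data SHA256 = SHA256

-- The BLAKE2b personalisation field is 16 bytes wide.
newtype Personalization = Personalization [Word8]
  deriving (Eq, Show)

-- Reject inputs longer than 16 bytes; zero-pad shorter ones (an assumption).
mkPersonalization :: [Word8] -> Maybe Personalization
mkPersonalization bs
  | length bs > 16 = Nothing
  | otherwise = Just (Personalization (take 16 (bs ++ replicate 16 0)))

-- Stand-in for a hash context: it only remembers the personalisation.
newtype PersoContext h = PersoContext Personalization
  deriving (Eq, Show)

-- Only algorithms that support personalisation get an instance.
class HashInitPerso h where
  hashInitPerso :: Personalization -> PersoContext h

instance HashInitPerso (Blake2b bitlen) where
  hashInitPerso = PersoContext

main :: IO ()
main =
  case mkPersonalization [1, 2, 3] of
    Nothing -> putStrLn "personalisation too long"
    Just p -> print (hashInitPerso p :: PersoContext (Blake2b 400))
```

Because SHA256 has no instance, an expression such as hashInitPerso p :: PersoContext SHA256 is rejected at compile time, which is the point of keeping this in a separate class.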

vincenthz commented 4 years ago

forgot to reply to this:

> The signature of hashInitWith suggests an intention of being able to support algorithms that expose end-user-configurable hashes, but then the implementation is just const hashInit.

No, hashInitWith is just a helper for specifying the algorithm from before TypeApplications existed; it is effectively superseded by:

{-# LANGUAGE TypeApplications #-}
hashInit @Hash

nuttycom commented 4 years ago

@vincenthz thanks for your response. Something that occurred to me as a possibility is this:

+data Blake2bPers (bitLen :: Nat) (personal :: Symbol) = Blake2bPers
+    deriving (Show, Data)
+
+instance (IsDivisibleBy8 bitlen, KnownNat bitlen, IsAtLeast bitlen 8, IsAtMost bitlen 512, KnownSymbol personal)
+      => HashAlgorithm (Blake2bPers bitlen personal)

This seems like it could fit into the existing framework? The personalization string is limited to 16 bytes, and then you get the type-level guarantees. Same goes for the salt; a 16-character symbol in the type doesn't seem so bad. Would you consider a PR to this effect?

vincenthz commented 4 years ago

Yes, that could work too. The only problem, I think, is that you cannot easily constrain the length of your Symbol: GHC's magic TypeLits lack a Symbol-to-length operation (probably a type family).

edit: another potential problem is that it's a Unicode string, not a byte string.
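
One way to work around both concerns is to check the Symbol when it is reified, rather than in the type. The sketch below uses only base and is an illustration under assumptions: Blake2bPers is redeclared locally as a stand-in, personalBytes is a hypothetical helper, and no hashing is performed.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Char (isAscii, ord)
import Data.Proxy (Proxy (..))
import Data.Word (Word8)
import GHC.TypeLits (KnownSymbol, Nat, Symbol, symbolVal)

-- Local stand-in for the type proposed above; not part of cryptonite.
data Blake2bPers (bitlen :: Nat) (personal :: Symbol) = Blake2bPers

-- Hypothetical helper: turn the type-level personalisation into bytes,
-- rejecting non-ASCII characters and anything longer than 16 bytes.
personalBytes
  :: forall bitlen personal. KnownSymbol personal
  => Blake2bPers bitlen personal
  -> Either String [Word8]
personalBytes _
  | not (all isAscii s) = Left "personalisation must be ASCII"
  | length s > 16 = Left "personalisation must be at most 16 bytes"
  | otherwise = Right (map (fromIntegral . ord) s)
  where
    s = symbolVal (Proxy :: Proxy personal)

main :: IO ()
main = do
  print (personalBytes (Blake2bPers :: Blake2bPers 256 "example-personal"))
  print (personalBytes (Blake2bPers :: Blake2bPers 256 "this one is far too long"))
```

The trade-off is that an invalid personalisation is only detected when the value is used, not when the type is written.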

nuttycom commented 4 years ago

The personalization strings in the hashes for my use case are ASCII, but in terms of a general solution maybe the right thing would be to interpret those strings as hex, and raise an error in IO on init if they're invalid?
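
A rough sketch of that hex interpretation, using only base. The function names are hypothetical, and the exact error behaviour (a userError raised in IO) is an assumption about what "raise an error in IO on init" might look like.

```haskell
import Data.Char (digitToInt, isHexDigit)
import Data.Word (Word8)

-- Decode a hex string into at most 16 bytes of personalisation data.
decodeHexPersonal :: String -> Either String [Word8]
decodeHexPersonal s
  | not (all isHexDigit s) = Left "personalisation contains a non-hex character"
  | odd (length s) = Left "personalisation has an odd number of hex digits"
  | length s > 32 = Left "personalisation is longer than 16 bytes"
  | otherwise = Right (pairs s)
  where
    -- Each pair of hex digits becomes one byte.
    pairs (hi : lo : rest) =
      fromIntegral (16 * digitToInt hi + digitToInt lo) : pairs rest
    pairs _ = []

-- Variant that fails in IO, for use during context initialisation.
personalOrThrow :: String -> IO [Word8]
personalOrThrow = either (ioError . userError) pure . decodeHexPersonal

main :: IO ()
main = do
  print (decodeHexPersonal "00ff10")
  print (decodeHexPersonal "not hex")
  bytes <- personalOrThrow "cafe"
  print bytes
```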