haskell-crypto / cryptonite

lowlevel set of cryptographic primitives for haskell
Other
226 stars 139 forks source link

Add equivalent of OpenSSL's EVP_BytesToKey (e.g. generate key and IV from some password) #214

Open mgmeier opened 6 years ago

mgmeier commented 6 years ago

Maybe I've overlooked it in the library, but would that be a feature worth adding?

For my current project, I adapted some code from here: https://hackage.haskell.org/package/shadowsocks-1.20151028/src/Shadowsocks/Encrypt.hs

crypter :: ByteString -> ByteString -> ByteString
crypter password =
    cbcEncrypt (cipher :: AES256) iv . pad (PKCS7 16)
  where
    (key, iv_)          = evpBytesToKey password 32 16      -- key size, iv size for AES256
    Just iv             = makeIV iv_
    CryptoPassed cipher = cipherInit key

-- Haskell implementation of OpenSSL's EVP_BytesToKey
-- generate key and iv from a given password
evpBytesToKey :: ByteString -> Int -> Int -> (ByteString, ByteString)
evpBytesToKey password keyLen ivLen =
    let ms' = B.concat $ ms 0 []
        key = B.take keyLen ms'
        iv  = B.take ivLen $ B.drop keyLen ms'
    in (key, iv)
  where
    hash' :: ByteString -> ByteString
    hash' bs = convert (Crypto.hash bs :: Digest MD5)

    ms :: Int -> [ByteString] -> [ByteString]
    ms 0 _ = ms 1 [hash' password]
    ms i m
        | B.length (B.concat m) < keyLen + ivLen =
            ms (i+1) (m ++ [hash' (last m <> password)])
        | otherwise = m

Being just a quick hack, I see some things that would make it a good fit for adding it to cryptonite, like e.g.:

What do you think?

ocheron commented 6 years ago

Not worth it IMO: function size is small, especially after generalizing, and this is a legacy algorithm.

mgmeier commented 6 years ago

Hmm... I see. Function size doesn't seem like a hard criterion though, given that there are helpers like Crypto.Cipher.Utils.validateKeySize, having their own module no less.

The algorithm being deprecated is much more significant. I didn't actually know that, I've seen it being used all over the place... out of curiosity, where and when did that happen? I can only infer that from looking at openssl's source code... :(

However for future inclusion, or as a reference for anyone looking for an implementation, I've haskellified and optimized the above version a little and added parameters for salt usage and hash algorithm. I did not however integrate it with the Cipher type class; there'd need to be a wrapper function getting key and IV size for some cipher, generating a random salt, process everything and init the cipher's context according to all that.

evpBytesToKey :: HashAlgorithm alg =>
    Int -> Int -> alg -> Maybe B.ByteString -> B.ByteString -> (B.ByteString, B.ByteString)
evpBytesToKey keyLen ivLen alg mSalt password =
    let bytes       = B.concat . take required . iterate go $ hash' passAndSalt
        (key, rest) = B.splitAt keyLen bytes
    in (key, B.take ivLen rest)
  where
    hash'       = convert . hashWith alg
    required    = 1 + ((keyLen + ivLen - 1) `div` hashDigestSize alg)
    passAndSalt = maybe password (password <>) mSalt
    go          = hash' . (<> passAndSalt)
mgmeier commented 6 years ago

Of course, should you decide on including this as a utility function, I'd be willing to prepare a proper PR (using ByteArray, and providing sensible cipher context initialization).

ocheron commented 6 years ago

The implementation does not need two independent lengths keyLen and ivLen. Only the sum matters. You can get more applicability letting the caller do the splitting. Because deriving the IV from the same password than the key is not always the best option.

To me the algorithm looks like a non-standard extension of PBKDF1.

I tend to compare the size of the function to the size of the documentation explaining when (not) to use it :)

mgmeier commented 6 years ago

The function is not intended for use "as is" for cryptonite - I've already pointed that out above. It's current signature is just modeled after the corresponding call in OpenSSL. So, yes, of course the split should be done by the function initializing the cipher context from some password, since the offsets depend on which cipher is used. The goal is to end up not with ByteStrings, but with CryptoFailable.

Because deriving the IV from the same password than the key is not always the best option.

Agreed. It's not a very strong derivation mechanism. Again, I only intended to provide a native (i.e. without binding and depending on the library) way to process ciphertext that's been generated by software that uses OpenSSL's equivalent function, or that expects such ciphertext from your software. As I hinted, I've (sadly) encountered that numerous times, e.g. with payment platforms on the web.

I tend to compare the size of the function to the size of the documentation explaining when (not) to use it.

You're not making any sense. Is the function size measured in x86 machine instructions? The documentation size in UTF8 codepoints or natural language words, and in which natural language?On what grounds is such a comparison even useful and/or justified? Sorry for not getting the joke if there was one...

So are you genereally interested in providing a utility function for cipher initialization from a passphrase? I'm happy to discuss implementation details (like the splitting, see above) once you've decided, to guarantee the solution be a good fit for cryptonite's API design and coding style.

ocheron commented 6 years ago

Sorry if this is confusing, I was just trying to save effort here (which is the real measure). I still think it's best not to put spotlight and copy only where needed. The function has been deprecated for quite some time and I don't see it in many other libraries.