openwall / crypt_blowfish

Implementation of bcrypt password hashing scheme
https://www.openwall.com/crypt/
5 stars 2 forks source link

Problem: Documentation 1) poorly documented and 2) unclear as to practical usability #1

Open WarrenN1 opened 1 year ago

WarrenN1 commented 1 year ago

I am struggling with finding examples for OpenWall's bcrypt implementation that use crypt_gensalt_rn and crypt_r that also explain in depth exactly what is happening in terms of input, settings etc and more generally the cryptographic component. crypt and crypt_gensalt are not really viable due to them not being MT-Safe so I am trying to better understand the rn, ra, and r implementations. For example, for salt generation using crypt_gensalt_rn it seems like it takes an input to seed the salt generator, but the problem is that the seed should be a cryptographically secure random sequence, but from the documentation I could not find an indication of length for said input or a reference other than to use dev/random, even though with multiple users that is not sustainable

solardiz commented 1 year ago

Hi. Thank you for the feedback/criticism.

The currently available documentation is the man page, which documents the generic API and is not limited to bcrypt. It does also have a section about bcrypt, where it says the salt size for bcrypt is 128 bits, so that's the input size you need to provide to crypt_gensalt* when using bcrypt. We recommend that you read 16 bytes from /dev/urandom (not /dev/random, which used to be blocking) per each call to crypt_gensalt*. This is sustaintable.

On a related note, crypt_blowfish is now integrated into https://github.com/besser82/libxcrypt, which provides these functions as standard on recent Linux distros. In libxcrypt, they have also been enhanced such that they're able to retrieve the randomness from the operating system on their own.

Finally, I don't know what your use case is, but for large deployments (e.g., 1M+ users) we recommend our newer yescrypt, preferably with its ROM feature (libxcrypt also includes yescrypt, but without the ROM).

WarrenN1 commented 1 year ago

Hello. The problem is I checked their documentation and the man pages. My problems are still dealing with ambiguity and lack of clarity in the man pages. For example:

  1. The salt size is 128 bits (16 bytes). However, "crypt_gensalt_rn" returns a setting in the form of: $2a$$ for example. So in this case, would the number of bytes I need for the setting array be 7+128/8 = 23 as I would need 7 bytes for storing the $2a$$ part and another 16 for the salt? Or is the 128 bits representative of the entire setting string. Then again, in the code itself it says that CRYPT_GENSALT_OUTPUT_SIZE is defined as:

    #define CRYPT_GENSALT_OUTPUT_SIZE 192

    which would imply 192 bytes?

  2. "crypt_ra" allocates and creates a "struct crypt" object. However, there is no detail as to the specifics of the bcrypt struct crypt. My current understanding it is simply boilerplate so that multiple crypt implementations can be standardized.

    On the first call to crypt_ra, data should be the address of a void * variable set to NULL, and size should be the address of an int variable set to zero.  crypt_ra will allocate and initialize a crypt_data object, using malloc(3), and write its address and size  into
       *data and *size.  These can be reused in subsequent calls.  After the application is done hashing passphrases, it should deallocate *data using free(3).

    Currenty the returned size of this crypt_ra call per the man pages is 32768. I have read online that bcrypt produces hashes of length 60 (ie. 23 bytes for the (settings+rounds+salt+hash), so that makes me believe that the implied hash length is 37 bytes, even though I cannot find that anywhere in the man pages. So I would take in this case the first 60 bytes and store that, but I want a bit more assurance that this is the case than random stack overflow and mysql forum posts before I implement this for users as I am using the C/C++ built in library.

Any help/guidance you could give would be deeply appreciated as I want to understand the crypto library I am using before I simply copy paste code from the internet. I want to make sure my (hopefully) future users feel secure and confident in the cryptography being used.

I got all the magic numbers from https://github.com/besser82/libxcrypt/blob/develop/lib/crypt.h.in

solardiz commented 1 year ago
  1. The salt size is 128 bits (16 bytes).

This is the input randomness size - the number of bytes you'd read from /dev/urandom and pass into crypt_gensalt* as input (called rbytes in libxcrypt man pages).

With libxcrypt, you can also simply pass NULL for rbytes and 0 for nrbytes. It'll take care of retrieve the right number of random bytes on its own. (A feature not present in our crypt_blowfish here, but added in libxcrypt.)

define CRYPT_GENSALT_OUTPUT_SIZE 192

This is the maximum output string size of crypt_gensalt*, and a comment says this is indeed generous (actual strings are generally much shorter). If you use crypt_gensalt_rn from libxcrypt, you should also use their CRYPT_GENSALT_OUTPUT_SIZE as the allocation size of output.

Alternatively, you can use crypt_gensalt_ra and it'll take care of the allocation for you, but you'll be expected to free that memory.

Here's example usage in our https://github.com/openwall/tcb:

        char *salt;

#ifdef CRYPT_GENSALT_IMPLEMENTS_AUTO_ENTROPY
        salt = crypt_gensalt_ra(pam_unix_param.crypt_prefix,
            pam_unix_param.count, NULL, 0);
#else
        char entropy[16];
        int fd;

        fd = open("/dev/urandom", O_RDONLY);
        if (fd < 0) {
                pam_syslog(pamh, LOG_CRIT, "open urandom: %m");
                return NULL;
        }
        if (read_loop(fd, entropy, sizeof(entropy)) != sizeof(entropy)) {
                pam_syslog(pamh, LOG_CRIT, "read urandom: %m");
                close(fd);
                return NULL;
        }
        close(fd);

        salt = crypt_gensalt_ra(pam_unix_param.crypt_prefix,
            pam_unix_param.count, entropy, sizeof(entropy));

        memset(entropy, 0, sizeof(entropy));
#endif

        if (!salt) {
                pam_syslog(pamh, LOG_CRIT, "crypt_gensalt_ra: %m");
                return NULL;
        }

"crypt_ra" allocates and creates a "struct crypt" object. However, there is no detail as to the specifics of the bcrypt struct crypt. My current understanding it is simply boilerplate so that multiple crypt implementations can be standardized.

Exactly. This is an opaque object that you should not access directly anyway.

Currenty the returned size of this crypt_ra call per the man pages is 32768.

Not exactly. It says: "struct crypt_data may be quite large (32kB in this implementation of libcrypt;" - this refers to the maximum allocation that this implementation can make (the API is such that the allocation can grow between calls if a more space-demanding hashing method is used after a less-demanding one).

The man page does say that "crypt_r, crypt_rn, and crypt_ra place their result in the output field of the crypt_data object" (wording in libxcrypt; here it is different). Maybe it's wrong that this detail is exposed in the man page, @besser82, and is part of what caused the confusion in @WarrenN1's comment above.

So I would take in this case the first 60 bytes and store that

While that would probably work with libxcrypt, it is not the way to go. Please just use the pointer returned by crypt_ra - the function's return value - and store the NUL-terminated string from there. Yes, it will happen to be 60 characters (plus NUL) in case of bcrypt (and no failure), but your code probably doesn't need to know that in advance.

Again, here's example usage in our tcb:

static char *crypt_wrapper_ra(pam_handle_t *pamh, const char *key,
    const char *salt)
{
        char *retval;
        void *data = NULL;
        int size = 0;

        retval = crypt_ra(key, salt, &data, &size);
        if (retval)
                retval = strdup(retval); /* we return NULL if strdup fails */
        else
                pam_syslog(pamh, LOG_CRIT, "crypt_ra: %m");
        if (data) {
                memset(data, 0, size);
                free(data);
        }
        return retval;
}

(This one doesn't take advantage of possibly reusing the allocation across multiple calls.)