kazuho / picohash

header-file-only implementation of various hash algorithms
89 stars 11 forks source link

gcc11+ exposes aliasing/type punning bug #8

Open melak opened 2 years ago

melak commented 2 years ago

I have recently updated https://github.com/melak/openvpn_radauth/blob/master/libradius/radlib.c to use picohash for its MD5ing and HMAC-MD5ing needs (thanks!).

There is one tiny snag - it appears when compiled with GCC 11.2 with any optimisation level beyond -O1, something doesn't work (and I don't know what it is). My test system exhibiting the problem is Ubuntu 22.04/x86_64. It works perfectly with Clang 14 with any optimisation level, it works perfectly with GCC 11.2 with -O1 or below, it does not work with GCC 11.2 -O2, -O3 or -Os.

It also works fine on various older Linux distros and FreeBSD, where the compiler is either older than GCC11 or isn't GCC at all.

How to repeat (it's fairly low-effort and requires only one or two Perl packages beyond a minimal development environment):

I don't know if this is a bug in GCC11, more broadly with whatever Ubuntu 22.04 is shipping in various components and things, or an issue with picohash.

melak commented 2 years ago

Here's a self-contained test case:

#include <stdio.h>
#include "picohash.h"

void picohash_hmac_test(char *key, int keylen, char *data, int datalen);
void picohash_md5_test(char *data, int datalen);

int main(int argc, char **argv)
{
    printf("picohash HMAC:\n");
    picohash_hmac_test(
        "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",
        16,
        "Hi There",
        8
    );

    printf("picohash MD5:\n");
    picohash_md5_test("The quick brown fox jumps over the lazy dog", 43);
}

void picohash_hmac_test(char *key, int keylen, char *data, int datalen)
{
    picohash_ctx_t ctx;
    char digest[PICOHASH_MD5_DIGEST_LENGTH];

    picohash_init_hmac(&ctx, picohash_init_md5, key, keylen);
    picohash_update(&ctx, data, datalen);
    picohash_final(&ctx, digest);

    printf("0x");
    for(int i = 0; i < PICOHASH_MD5_DIGEST_LENGTH; i++)
    {   
        printf("%02x", digest[i] & 0xff);
    }
    printf("\n");
}

void picohash_md5_test(char *data, int datalen)
{
    picohash_ctx_t ctx;
    char digest[PICOHASH_MD5_DIGEST_LENGTH];

    picohash_init_md5(&ctx);
    picohash_update(&ctx, data, datalen);
    picohash_final(&ctx, digest);

    printf("0x");
    for(int i = 0; i < PICOHASH_MD5_DIGEST_LENGTH; i++)
    {
        printf("%02x", digest[i] & 0xff);
    }
    printf("\n");
}

This results in:

$ cc --version
cc (Ubuntu 11.2.0-18ubuntu1) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ cc -Wall pht.c -o pht
$ ./pht                                          
picohash HMAC:
0x9294727a3638bb1c13f48ef8158bfc9d
picohash MD5:
0x9e107d9d372bb6826bd81d3542a419d6

$ cc -Wall pht.c -o pht -Os
$ ./pht                                          
picohash HMAC:
0x82bdcd4c23d4d4d210923ced75ba19ef
picohash MD5:
0xefcacc391df1ac98ed73e418e7c22bcc

Compiled with Clang, the issue is not exhibited:

$ clang --version
Ubuntu clang version 14.0.0-+rc4-1ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ clang -Wall pht.c -o pht
$ ./pht                                           
picohash HMAC:
0x9294727a3638bb1c13f48ef8158bfc9d
picohash MD5:
0x9e107d9d372bb6826bd81d3542a419d6

$ clang -Wall pht.c -o pht -Os
$ ./pht                                          
picohash HMAC:
0x9294727a3638bb1c13f48ef8158bfc9d
picohash MD5:
0x9e107d9d372bb6826bd81d3542a419d6
melak commented 2 years ago

GCC-12 exhibits the problem as well. Different optimisation levels produce different results. Slightly different versions of the compiler on different systems (i.e. I asked a mate to run the test on his system with a slightly different version of gcc11) produce different results.

melak commented 2 years ago

-O1 -fstrict-aliasing seems to be one reduced-to-minimum option set exhibiting the problem. It also makes the result to be wrong the same way across different compiler versions.

melak commented 2 years ago

Consequently, -Os -fno-strict-aliasing also fixesmasks the problem.

melak commented 2 years ago

To be fair, this looks more and more like a GCC bug.

melak commented 2 years ago

Maybe not. It seems to be specific to the MD5 routines in picohash (neither sha1 nor sha256 exhibit the problem).

melak commented 2 years ago

Replacing uint_fast32_t with uint32_t in picohash.h also seems to be at least masking the problem. Note that uint_fast32_t is 8 bytes, while uint32_t is 4 bytes on this particular architecture. I am hazarding a guess this has something to do with

#define PICOHASH_MD5_DIGEST_LENGTH 16
typedef struct {
    uint_fast32_t lo, hi;
    uint_fast32_t a, b, c, d;
    unsigned char buffer[64];
    uint_fast32_t block[PICOHASH_MD5_DIGEST_LENGTH];
} _picohash_md5_ctx_t;

and

 * The check for little-endian architectures which tolerate unaligned
 * memory accesses is just an optimization.  Nothing will break if it
 * doesn't work.
 */
#define _PICOHASH_MD5_SET(n)                                                                                                       \
    (ctx->block[(n)] = (uint_fast32_t)ptr[(n)*4] | ((uint_fast32_t)ptr[(n)*4 + 1] << 8) | ((uint_fast32_t)ptr[(n)*4 + 2] << 16) |  \
                       ((uint_fast32_t)ptr[(n)*4 + 3] << 24))

Well, it appears something has?

melak commented 2 years ago

This seems to also be (somewhat) supported by the fact that the MD5 hash of zero-length data is correct either way.

melak commented 2 years ago

Oh, I only just now noticed this: https://github.com/kazuho/picohash/blob/master/picohash.h#L141

The code path active on x64 says uint32_t. The actual data types are uint_fast32_t. They are different in size. There's the aliasing bug.

Str1ker17 commented 1 year ago

Thank you very much! Just encountered the same problem with GCC 11 on Ubuntu 22 on the older version of picohash.

Guess the issue can be closed as the author had invented the fix: https://github.com/kazuho/picohash/commit/3a9eff997553fb6547d38763628f8c9f92fcda19