nothings / stb

stb single-file public domain libraries for C/C++
https://twitter.com/nothings
Other
26.81k stars 7.72k forks source link

stb_ds.h fails unit tests gcc 11.3.0 #1406

Closed Vaxeral closed 1 year ago

Vaxeral commented 1 year ago

I tired running the unit test after getting unexpected results when compiling with address sanitizer. It failed one test. main.c

#define STB_DS_IMPLEMENTATION
#define STBDS_UNIT_TESTS
#include "stb_ds.h"

int main(int argc, char const *argv[])
{
    stbds_unit_tests();
    return 0;
}

gcc main.c

a.out: src/stb_ds.h:1847: stbds_unit_tests: Assertion `hmgets(map3, s.key).d == i*5' failed.

I expected a piece in my game to move to its valid location, but it did not. I suspect it was a failed call to hmgeti(game->valid_actions, *action) != -1. The bug manifests itself in several ways. If i compile with no -fsanitize=address it appears to work. I checked the hashmap by iterating over it and it did have my entry but when using hmgeti it does not find it.

Relevant structures.

typedef struct Position {
    int q;
    int r;
    int s;
} Position;

typedef struct Action {
    char piece;
    Position source;
    Position destination;
} Action;

typedef struct ActionHashMapEntry {
    Action key;
} ActionHashMapEntry;

#define COLUMNS 127
#define ROWS 127
#define LAYERS 7
#define SPACES (COLUMNS * ROWS * LAYERS)

typedef struct Game {
    Position positions[PIECES];
    char board[SPACES];
    int current_turn;
    int pieces_in_play;
    Action *history;
    ActionHashMapEntry *valid_actions;
} Game;

I Create a hashmap of ActionHashMapEntrys.

Game game = {0}; // Init game.valid_actions to NULL

I insert and query for ActionHashMapEntrys

Action action = {piece, piece_position, adjacent_position};
hmputs(game->valid_actions, (ActionHashMapEntry){action});

hmgeti(game->valid_actions, *action) != -1;

I free and resuse the hashmap.

hmfree(game.valid_actions);

for(int i = 1; i < PIECES; ++i)
{
    game_generate_moves_for_piece(&game, i);
}

I never pass the pointer to the hashmap directly it is always referenced by the Game structure.

Vaxeral commented 1 year ago
typedef struct __attribute__((__packed__))  {
    char piece;
    Position source;
    Position destination;
} Action;

I have to pack it for it to pass through valgrind.

Vaxeral commented 1 year ago

I can't tell if i am using the library wrong.

Vaxeral commented 1 year ago

src/stb_ds.h:1082:54: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'

I get this error message when compiling with gcc src/*.c -o bin/a.out -g -fsanitize=undefined -lncursesw

Could it be that since a sanitizer detects undefined behaviour it screws with the results?

N-R-K commented 1 year ago

LShift in C standard is defined in terms of exponent ^0. For unsigned types, it gets reduced via modulo but for (non-negative) signed types the result must fit in that type (in this case INT_MAX) otherwise the behavior is undefined.

You can fix that issue via a cast:

   for (i=0; i+sizeof(size_t) <= len; i += sizeof(size_t), d += sizeof(size_t)) {
-    data = d[0] | (d[1] << 8) | (d[2] << 16) | (d[3] << 24);
-    data |= (size_t) (d[4] | (d[5] << 8) | (d[6] << 16) | (d[7] << 24)) << 16 << 16; // discarded if size_t == 4
+    data = d[0] | (d[1] << 8) | (d[2] << 16) | ((size_t)d[3] << 24);
+    data |= (size_t) (d[4] | (d[5] << 8) | (d[6] << 16) | ((size_t)d[7] << 24)) << 16 << 16; // discarded if size_t == 4

But I highly doubt that's having an effect on the runtime behavior (should still be fixed regardless).

Vaxeral commented 1 year ago

That is interesting. Thanks for sharing. If you all need any more information or clarification just ask. I was not able to reproduce the error in my application in a minimal example so it may still be something wrong on my end.

Vaxeral commented 1 year ago

When compiling my application with gcc src/*.c -o bin/a.out -g -fsanitize=undefined -lncursesw and running with valgrind I get several of these messages and I do not get the expected output ie my piece does not move do to the hmgeti function not finding the entries i gave it.

==19966== Memcheck, a memory error detector
==19966== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19966== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==19966== Command: ./a.out
==19966== 
==19966== Conditional jump or move depends on uninitialised value(s)
==19966==    at 0x11BC2D: stbds_hmput_key (stb_ds.h:1383)
==19966==    by 0x1264E4: game_generate_moves_for_queen (busy.c:586)
==19966==    by 0x1274E1: game_generate_moves_for_piece (busy.c:643)
==19966==    by 0x12BD5D: main (main.c:172)
==19966== 
==19966== Conditional jump or move depends on uninitialised value(s)
==19966==    at 0x11C384: stbds_hmput_key (stb_ds.h:1393)
==19966==    by 0x1264E4: game_generate_moves_for_queen (busy.c:586)
==19966==    by 0x1274E1: game_generate_moves_for_piece (busy.c:643)
==19966==    by 0x12BD5D: main (main.c:172)
==19966== 
==19966== Conditional jump or move depends on uninitialised value(s)
==19966==    at 0x11BDD8: stbds_hmput_key (stb_ds.h:1394)
==19966==    by 0x1264E4: game_generate_moves_for_queen (busy.c:586)
==19966==    by 0x1274E1: game_generate_moves_for_piece (busy.c:643)
==19966==    by 0x12BD5D: main (main.c:172)
==19966== 
==19966== Conditional jump or move depends on uninitialised value(s)
==19966==    at 0x11BE32: stbds_hmput_key (stb_ds.h:1394)
==19966==    by 0x1264E4: game_generate_moves_for_queen (busy.c:586)
==19966==    by 0x1274E1: game_generate_moves_for_piece (busy.c:643)
==19966==    by 0x12BD5D: main (main.c:172)
==19966== 
==19966== Use of uninitialised value of size 8
==19966==    at 0x11BE58: stbds_hmput_key (stb_ds.h:1394)
==19966==    by 0x1264E4: game_generate_moves_for_queen (busy.c:586)
==19966==    by 0x1274E1: game_generate_moves_for_piece (busy.c:643)
==19966==    by 0x12BD5D: main (main.c:172)

...

==19966== 
==19966== 
==19966== HEAP SUMMARY:
==19966==     in use at exit: 234,796 bytes in 444 blocks
==19966==   total heap usage: 513 allocs, 69 frees, 325,321 bytes allocated
==19966== 
==19966== LEAK SUMMARY:
==19966==    definitely lost: 0 bytes in 0 blocks
==19966==    indirectly lost: 0 bytes in 0 blocks
==19966==      possibly lost: 2,640 bytes in 19 blocks
==19966==    still reachable: 232,156 bytes in 425 blocks
==19966==         suppressed: 0 bytes in 0 blocks
==19966== Rerun with --leak-check=full to see details of leaked memory
==19966== 
==19966== Use --track-origins=yes to see where uninitialised values come from
==19966== For lists of detected and suppressed errors, rerun with: -s
==19966== ERROR SUMMARY: 552 errors from 57 contexts (suppressed: 0 from 0)

stbds_hmput_key (stb_ds.h:1383)

// we iterate hash table explicitly because we want to track if we saw a tombstone
  {
    size_t hash = mode >= STBDS_HM_STRING ? stbds_hash_string((char*)key,table->seed) : stbds_hash_bytes(key, keysize,table->seed);
    size_t step = STBDS_BUCKET_LENGTH;
    size_t pos;
    ptrdiff_t tombstone = -1;
    stbds_hash_bucket *bucket;

    // stored hash values are forbidden from being 0, so we can detect empty slots to early out quickly
    if (hash < 2) hash += 2; // line: 1383

    pos = stbds_probe_position(hash, table->slot_count, table->slot_count_log2);

    for (;;) {
      size_t limit, i;
      STBDS_STATS(++stbds_hash_probes);
      bucket = &table->storage[pos >> STBDS_BUCKET_SHIFT];

      // start searching from pos to end of bucket
      for (i=pos & STBDS_BUCKET_MASK; i < STBDS_BUCKET_LENGTH; ++i) {
        if (bucket->hash[i] == hash) {
          if (stbds_is_key_equal(raw_a, elemsize, key, keysize, keyoffset, mode, bucket->index[i])) {
            stbds_temp(a) = bucket->index[i];
            if (mode >= STBDS_HM_STRING)
              stbds_temp_key(a) = * (char **) ((char *) raw_a + elemsize*bucket->index[i] + keyoffset);
            return STBDS_ARR_TO_HASH(a,elemsize);
          }
        } else if (bucket->hash[i] == 0) {
          pos = (pos & ~STBDS_BUCKET_MASK) + i;
          goto found_empty_slot;
        } else if (tombstone < 0) {
          if (bucket->index[i] == STBDS_INDEX_DELETED)
            tombstone = (ptrdiff_t) ((pos & ~STBDS_BUCKET_MASK) + i);
        }
      }

      // search from beginning of bucket to pos
      limit = pos & STBDS_BUCKET_MASK;
      for (i = 0; i < limit; ++i) {
        if (bucket->hash[i] == hash) {
          if (stbds_is_key_equal(raw_a, elemsize, key, keysize, keyoffset, mode, bucket->index[i])) {
            stbds_temp(a) = bucket->index[i];
            return STBDS_ARR_TO_HASH(a,elemsize);
          }
        } else if (bucket->hash[i] == 0) {
          pos = (pos & ~STBDS_BUCKET_MASK) + i;
          goto found_empty_slot;
        } else if (tombstone < 0) {
          if (bucket->index[i] == STBDS_INDEX_DELETED)
            tombstone = (ptrdiff_t) ((pos & ~STBDS_BUCKET_MASK) + i);
        }
      }

      // quadratic probing
      pos += step;
      step += STBDS_BUCKET_LENGTH;
      pos &= (table->slot_count-1);
    }
   found_empty_slot:
    if (tombstone >= 0) {
      pos = tombstone;
      --table->tombstone_count;
    }
    ++table->used_count;

    {
      ptrdiff_t i = (ptrdiff_t) stbds_arrlen(a);
      // we want to do stbds_arraddn(1), but we can't use the macros since we don't have something of the right type
      if ((size_t) i+1 > stbds_arrcap(a))
        *(void **) &a = stbds_arrgrowf(a, elemsize, 1, 0);
      raw_a = STBDS_ARR_TO_HASH(a,elemsize);

      STBDS_ASSERT((size_t) i+1 <= stbds_arrcap(a));
      stbds_header(a)->length = i+1;
      bucket = &table->storage[pos >> STBDS_BUCKET_SHIFT];
      bucket->hash[pos & STBDS_BUCKET_MASK] = hash;
      bucket->index[pos & STBDS_BUCKET_MASK] = i-1;
      stbds_temp(a) = i-1;

      switch (table->string.mode) {
         case STBDS_SH_STRDUP:  stbds_temp_key(a) = *(char **) ((char *) a + elemsize*i) = stbds_strdup((char*) key); break;
         case STBDS_SH_ARENA:   stbds_temp_key(a) = *(char **) ((char *) a + elemsize*i) = stbds_stralloc(&table->string, (char*)key); break;
         case STBDS_SH_DEFAULT: stbds_temp_key(a) = *(char **) ((char *) a + elemsize*i) = (char *) key; break;
         default:                memcpy((char *) a + elemsize*i, key, keysize); break;
      }
    }
    return STBDS_ARR_TO_HASH(a,elemsize);
  }

game_generate_moves_for_queen (busy.c:586)

int game_generate_moves_for_queen(Game *game, char piece)
{
    assert(piece > 0 && piece <= BP1);

    Position piece_position = game->positions[(int)piece];
    if(piece_position.s < 0)
        return 1;
    if(game_is_one_hive(game, piece_position)) {
        unsigned char neighbor_bitset = 0;
        for(int i = 0; i < DIRECTIONS; ++i) {
            Position adjacent_position = COFFSETS[piece_position.q % 2][i];
            adjacent_position.q = adjacent_position.q + piece_position.q;
            adjacent_position.r = adjacent_position.r + piece_position.r;
            adjacent_position.s = adjacent_position.s + piece_position.s;
            char adjacent_piece = game->board[
                adjacent_position.q +
                adjacent_position.r * COLUMNS +
                adjacent_position.s * COLUMNS * ROWS];
            if(adjacent_piece && adjacent_piece != piece)
                neighbor_bitset |= DIRECTION_BIT(i);
        }
        unsigned char available = available_neighbors(neighbor_bitset);
        int valid = 0;
        for(int i = 0; i < DIRECTIONS; ++i) {
            Position adjacent_position = COFFSETS[piece_position.q % 2][i];
            adjacent_position.q = adjacent_position.q + piece_position.q;
            adjacent_position.r = adjacent_position.r + piece_position.r;
            adjacent_position.s = adjacent_position.s + piece_position.s;
            if(available & DIRECTION_BIT(i)) {
                ++valid;
                Action action = {piece, piece_position, adjacent_position};
                hmputs(game->valid_actions, (ActionHashMapEntry){action});                 // line: 586
            }
        }
        return !(valid > 0);
    } else {
        return 1;
    }
}

game_generate_moves_for_piece (busy.c:643)

int game_generate_moves_for_piece(Game *game, char piece)
{
    switch(CBUGS[(int)piece]) {
        case QUEEN:
            return game_generate_moves_for_queen(game, piece);
        case BEETLE:
            return game_generate_moves_for_beetle(game, piece);
        case GRASSHOPPER:
            return game_generate_moves_for_grasshopper(game, piece);
        case SPIDER:
            return game_generate_moves_for_spider(game, piece);
        case ANT:
            return game_generate_moves_for_ant(game, piece);
        default:
            return 1;
    }
}

main (main.c:172)

    hmfree(game.valid_actions);

    for(int i = 1; i < PIECES; ++i)
    {
        game_generate_moves_for_piece(&game, i);              // line: 172
    }
Vaxeral commented 1 year ago

I find the output of valgrind to be wrong somehow. In the line if (hash < 2) hash += 2; hash is clearly initialized above.

turol commented 1 year ago

Action is partially uninitialized. There's a 3-byte gap between piece and source because of alignment. It goes away when you pack the struct. Zero-initialize with memset to avoid this.

N-R-K commented 1 year ago

Zero-initialize with memset to avoid this.

That's not a very foolproof solution because the padding bytes will end up becoming unspecified again the moment you store something in the struct ^0:

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

For example:

struct { char a; int b; } thing = { .a = 1 };
// a is set to 1
// b is implicitly zero-ed
// padding bytes are unspecified

memset(&thing, 0, sizeof thing); // everything zero-ed out.

thing.b = 5; // padding bytes are unspecified again
turol commented 1 year ago

In practice the unspecified values will stay consistent. If you want to fix this "properly" you'll need to explicitly add padding bytes to the struct and zero them out on initialization.

N-R-K commented 1 year ago

In practice the unspecified values will stay consistent. If you want to fix this "properly" you'll need to explicitly add padding bytes to the struct and zero them out on initialization.

Aye, another way to solve this is to use a larger type so there is no padding (in this case: turning char piece into int piece). This way you won't have to worry about initializing the padding bytes separately (e.g struct { char piece; char pad[3]; ... }).

Vaxeral commented 1 year ago

Thanks! This seems to be the problem.

nothings commented 1 year ago

I'm glad your problem is solved. Unfortunately, that still leaves open the question of the stb_ds unit tests failing but only with address sanitizer enabled.

N-R-K commented 1 year ago

Unfortunately, that still leaves open the question of the stb_ds unit tests failing but only with address sanitizer enabled.

It fails for me even without ASan when compiled in C (there's another issue opened regarding the unit test failing: https://github.com/nothings/stb/issues/1251).

Compiled in C++ it works fine. The description section in the README says "will compile in C++" so maybe that's intended (?)

nothings commented 1 year ago

Ok, that's good to know, closing this one in favor of #1251