troydhanson / uthash

C macros for hash tables and more
Other
4.18k stars 926 forks source link

Clang identifies the potential to save some memory by reodering the components inside struct UT_hash_table #118

Open aaronmdjones opened 7 years ago

aaronmdjones commented 7 years ago

Building an application using uthash with Clang 3.8 and -Wpadded yields the following diagnostic:

./lib/utils/uthash.h:1056:27: warning: padding struct 'struct UT_hash_table' with 4 bytes to align 'tail' [-Wpadded]
   struct UT_hash_handle *tail; /* tail hh in app order, for fast append    */
                          ^

This is because it is a pointer value but it is located after a number of smaller variables whose total alignment does not equal that of the pointer.

Rearranging the structure, from:

typedef struct UT_hash_table {
   UT_hash_bucket *buckets;
   unsigned num_buckets, log2_num_buckets;
   unsigned num_items;
   struct UT_hash_handle *tail;
   ptrdiff_t hho;
   unsigned ideal_chain_maxlen;
   unsigned nonideal_items;
   unsigned ineff_expands, noexpand;
   uint32_t signature;
#ifdef HASH_BLOOM
   uint32_t bloom_sig;
   uint8_t *bloom_bv;
   uint8_t bloom_nbits;
#endif
} UT_hash_table;

To:

typedef struct UT_hash_table {
   struct UT_hash_handle *tail;
   UT_hash_bucket *buckets;
   ptrdiff_t hho;
   unsigned num_buckets, log2_num_buckets;
   unsigned num_items;
   unsigned ideal_chain_maxlen;
   unsigned nonideal_items;
   unsigned ineff_expands, noexpand;
   uint32_t signature;
#ifdef HASH_BLOOM
   uint32_t bloom_sig;
   uint8_t *bloom_bv;
   uint8_t bloom_nbits;
#endif
} UT_hash_table;

Reduces its size from 64 bytes:

(gdb) print /s sizeof(struct UT_hash_table)
$1 = 64

To 56 bytes:

(gdb) print /s sizeof(struct UT_hash_table)
$1 = 56

... and eliminates the diagnostic, making uthash build cleaner under common warnings, while saving a small amount of memory.

For more information, see here.

Quuxplusone commented 7 years ago

Changing the layout of that struct would break ABI compatibility (e.g. if someone were fwriting a hash node to disk, or passing a pointer to a hash node across an ABI boundary), so we'd have to be careful if we did this at all. I'd want to at least keep the old layout accessible via -DUTHASH_OLD_LAYOUT or something; and I'd want someone to look into the effect on "tests/hashscan.c" (I think it'd be fine but I'm not sure off the top of my head).

I'm inclined to say this is a "Won't Fix" unless someone is proposing a specific patch.

aaronmdjones commented 7 years ago

I understand, and I have created such a patch, but I'm having trouble building tests/hashscan.c:

/tmp/hashscan-df36b2.o: In function `infer_hash_function':
hashscan.c:(.text+0x205d): undefined reference to `HASH_MUR'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
aaronmdjones commented 7 years ago

Nevermind, I got it to work (you might want to document in the comments in the hashscan.c file that HASH_USING_NO_STRICT_ALIASING needs to be defined).

The patch is here (I can create a pull request if you like) and the suite seems to work:

 $ clang-3.8 -DHASH_USING_NO_STRICT_ALIASING -DUTHASH_COMPACT_HASHTABLE_LAYOUT -Wpadded hashscan.c -o hashscan
In file included from hashscan.c:44:
./uthash.h:1093:16: warning: padding size of 'struct UT_hash_table' with 3 bytes to alignment boundary [-Wpadded]
typedef struct UT_hash_table {
               ^
hashscan.c:52:9: warning: padding size of 'vma_t' with 7 bytes to alignment boundary [-Wpadded]
typedef struct {
        ^
2 warnings generated.

 $ ./hashscan -v 21876
attaching to peer
waiting for peer to suspend temporarily
opening peer memory map [/proc/21876/maps]
listing peer virtual memory areas
peer has 39 virtual memory areas
opening peer memory
scanning peer memory for hash table signatures
found signature at peer 0x7faf675237d4
Address            ideal    items  buckets mc fl bloom/sat fcn keys saved to
------------------ ----- -------- -------- -- -- --------- --- -------------
0x7faf675237a0      100%        7       32  2 ok           ??? 
detaching and resuming peer

(Function is ??? because I have implemented my own HASH_FUNCTION using SipHash-2-4)

aaronmdjones commented 7 years ago

(I also moved bloom_bv up once space to align that, too)

tbeu commented 7 years ago

The patch is here (I can create a pull request if you like) and the suite seems to work:

I'd prefer such a pull request. Saving memory is more relevant for my application than ABI compatibility.