mogill / ems

Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
Other
597 stars 41 forks source link

potential GCC optimizations based on undefined behaviour #12

Closed David-Haim-zz closed 8 years ago

David-Haim-zz commented 8 years ago

Hi there, great library.

I have briefly looked into the code, and I have stumbled some potential problem with emsMem struct.

ems struct is declared as :

struct emsMem { int32_t level; uint8_t tree[1]; };

but tree is not an array of one byte, it is actually a slice of a bigger array allocated in emsMem_new:

struct emsMem self = (struct emsMem ) malloc(sizeof(struct emsMem) + sizeof(uint8_t) * (size * 2 - 2));

yet, what GCC see in other functions is basically access to tree as uint8_t(*)[1], for example

self->tree[index] == BUDDY_UNUSED

where index can be any number.

for GCC, this is undefined behavior if index != 0 (and it is, for most cases ) , even though tree is a valid allocated block, but tree is declared as uint8_t[1].

GCC is known to use optimizations that rely on undefined behavior. an improvement is to turn tree from uint8_t[1] to uint8_t*. for GCC, it cannot determine if tree[index] is undefined behavior or not at compile time, thus minimizing the possibility of undefined-behavior based optimizations.

mogill commented 8 years ago

Hi David,

Great eye, that tree[1] is a bit of a FORTRANism, not ideal C style. However, that code does have well-defined semantics, and everything would break horribly if the address generation was optimized away, so this problem couldn't lurk.

The buddy allocator isn't very sophisticated, but helps keep the build simple compared to redistributing a high quality library like jemalloc. The long-term plan is to eliminate the internal allocator and use pmem (which includes jemalloc) directly so EMS inherits all the persistence and transactional properties of the underlying OS and hardware. Good stuff, but still too new for mainstream Linux, so the built-in allocator will be around for a while.

I'm going to leave it as-is. Yours is the first review that code has had, and I really do appreciate it. If I missed something, don't hesitate to re-open this.

                  -J