kahypar / mt-kahypar

Mt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequential partitioning algorithms. Mt-KaHyPar can partition extremely large hypergraphs very fast and with high quality.

"misaligned address" problem with gcc 12 and gcc 13 #192

Open N-Maas opened 1 month ago

N-Maas commented 1 month ago

TL;DR: switching to gcc 14 or gcc 11 fixes the errors

With gcc 12 and gcc 13, running the tests in debug mode with address sanitizer enabled causes "misaligned address" errors that look like this (compare #188):

/usr/include/oneapi/tbb/parallel_invoke.h:40:5: runtime error: reference binding to misaligned address 0x76d6aa343320 for type 'struct __as_base ', which requires 64 byte alignment
0x76d6aa343320: note: pointer points here
 d6 76 00 00  40 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  20 8f cc 94 d6 76 00 00  40 00 00 00
              ^ 
/usr/include/oneapi/tbb/parallel_invoke.h:40:5: runtime error: member access within misaligned address 0x76d6aa343320 for type 'struct function_invoker', which requires 64 byte alignment
0x76d6aa343320: note: pointer points here
 d6 76 00 00  40 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  20 8f cc 94 d6 76 00 00  40 00 00 00
              ^ 

This issue is meant to raise awareness of the problem. As discussed below, there is likely no action required on our side.

The errors only happen with gcc 12 or gcc 13 (tested on Ubuntu 24.04). Older gcc versions, gcc 14.0.1, and clang do not show the problem. The stack trace points into TBB internals: it seems that a TBB struct declared as alignas(64) ends up misaligned. However, as far as I can tell the struct is just stack allocated, so the compiler itself should guarantee the alignment.
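To illustrate what "the alignment should be ensured by the compiler" means, here is a minimal sketch that is independent of TBB. `CacheAlignedTask`, `is_aligned` and `stack_object_is_aligned` are hypothetical names made up for this example; the struct only mimics the alignas(64) requirement from the error message and is not the real TBB type. The address is read through a volatile so that the compiler is less likely to fold the check away.

```cpp
#include <cassert>   // only needed for the usage shown below
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an over-aligned TBB-internal type (not the real struct).
struct alignas(64) CacheAlignedTask {
  void* payload = nullptr;
  std::uint64_t ref_count = 0;
};

// Returns true if the address of p is a multiple of alignof(T).
template <typename T>
bool is_aligned(const T* p) {
  // volatile read: discourage the optimizer from assuming the result
  volatile std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  return addr % alignof(T) == 0;
}

// Puts an over-aligned object on the stack and reports whether the
// compiler actually placed it on a 64 byte boundary.
bool stack_object_is_aligned() {
  CacheAlignedTask task;
  return is_aligned(&task);
}
```

Calling `assert(stack_object_is_aligned());` from `main` should pass on a conforming compiler. This sketch does not reproduce the reported error, it only shows the guarantee that appears to be violated inside `parallel_invoke.h`.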

Overall, I'm not sure whether it is UB in our code, UB in TBB or a compiler bug. However, the fact that it only happens with two specific gcc versions makes the last option actually seem plausible.