qlibs / mph

C++20 [Minimal] Static Perfect Hash library
176 stars 9 forks source link

Bug in mph::to #10

Closed jason7708 closed 1 month ago

jason7708 commented 1 month ago

I discover a bug in the to function when parameter data cannot be constant evaluated and -mbmi2 is not enabled. From this link on godbolt, you can see that the results differ between gcc and clang. The code with a bug is as follows:

T t;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
__builtin_memcpy(&t, data.data(), sizeof(t));
#pragma GCC diagnostic pop
const auto index = T(data.size() * __CHAR_BIT__);
#ifdef __BMI2__
if constexpr (sizeof(t) <= sizeof(u32)) {
  return __builtin_ia32_bzhi_si(t, index);
} else if constexpr (sizeof(t) <= sizeof(u64)) {
  return __builtin_ia32_bzhi_di(t, index);
} else
#endif
  return t & ((T(1) << index) - T(1));

In the example from the link, data.size() = 4, so index = 32. According to the C++ standard, u32(1) << 32 is undefined behavior, which is why gcc and clang produce different results here. Special handling is needed for cases where index >= size. image

Additionally, the part with __builtin_memcpy also needs to consider the case where data.size() < sizeof(t). Otherwise, it may exceed the boundary. Also, we should initialize t since it may copy with a size smaller than sizeof(T).

T t{};
__builtin_memcpy(&t, data.data(), data.size() < sizeof(t) ? data.size() : sizeof(t));
const auto index = T(data.size() * __CHAR_BIT__);
#ifdef __BMI2__
if constexpr (sizeof(t) <= sizeof(u32)) {
 return __builtin_ia32_bzhi_si(t, index);
} else if constexpr (sizeof(t) <= sizeof(u64)) {
  return __builtin_ia32_bzhi_di(t, index);
} else {
#endif
  constexpr T size = sizeof(T) * __CHAR_BIT__;
  return index >= size ? t : t & ((T(1) << index) - T(1));
#ifdef __BMI2__
}
#endif

I have submitted a pull request to fix both of these issues.