ASE: randomhash: fix undefined behaviour in Pcg32Rng

swesterfeld commented 1 year ago

Left shifting a 32bit value by 32 bits is UB.

tim-janik commented 1 year ago

Left shifting a 32bit value by 32 bits is UB.

The only reason to define ror32() is that it compiles into ROR with clang and gcc at -O2.

Have you checked that this is still true for the generated assembly with &31? If not, we might need to use a conditional.

And, since this is based on the PCG implementations, have you checked upstream (PCG here) how UB for ror is handled there?

tim-janik commented 1 year ago

Also, there is https://en.cppreference.com/w/cpp/numeric/rotr now, would make sense to verify the generated assembly for it.

swesterfeld commented 1 year ago

The only reason to define ror32() is that it compiles into ROR with clang and gcc at -O2.

Have you checked that this is still true for the generated assembly with &31? If not, we might need to use a conditional.

Yes, both gcc and clang generate ror for the new code.

And, since this is based on the PCG implementations, have you checked upstream (PCG here) how UB for ror is handled there?

For the minimal implementation (https://www.pcg-random.org/download.html#id1), they simply use this (which also compiles to ror - checked on compiler explorer):

uint32_t
ror32 (const uint32_t bits, const uint32_t offset)
{
   // bitwise rotate-right pattern recognized by gcc & clang iff 32==sizeof (bits)
  return (bits >> offset) | (bits << (-offset & 31));
}

So without subtracting from 32. I guess it is a matter of taste which version is "better".

The C implementation has:

/*
 * Rotate helper functions.
 */

[...]
/* Unfortunately, clang is kinda pathetic when  it comes to properly
 * recognizing idiomatic rotate code, so for clang we actually provide
 * assembler directives (enabled with PCG_USE_INLINE_ASM).  Boo, hiss.
 */
[...]
inline uint32_t pcg_rotr_32(uint32_t value, unsigned int rot)
{
#if PCG_USE_INLINE_ASM && __clang__ && (__x86_64__  || __i386__)
    asm ("rorl   %%cl, %0" : "=r" (value) : "0" (value), "c" (rot));
    return value;
#else
    return (value >> rot) | (value << ((- rot) & 31));
#endif
}

However, if I remove the inline and paste the code into compiler explorer, clang does the right thing even without using the assembly code.

I gave up on their C++ implementation though, it is simply too generic code for me to quickly check what it does

swesterfeld commented 1 year ago

Also, there is https://en.cppreference.com/w/cpp/numeric/rotr now, would make sense to verify the generated assembly for it.

Generates the same "correct" code, but is C++20.

tim-janik / anklang

ASE: randomhash: fix undefined behaviour in Pcg32Rng #14