Closed swesterfeld closed 1 year ago
Left shifting a 32bit value by 32 bits is UB.
The only reason to define ror32() is that it compiles into ROR with clang and gcc at -O2.
Have you checked that this is still true for the generated assembly with &31? If not, we might need to use a conditional.
And, since this is based on the PCG implementations, have you checked upstream (PCG here) how UB for ror is handled there?
Also, there is https://en.cppreference.com/w/cpp/numeric/rotr now, would make sense to verify the generated assembly for it.
The only reason to define ror32() is that it compiles into ROR with clang and gcc at -O2.
Have you checked that this is still true for the generated assembly with &31? If not, we might need to use a conditional.
Yes, both gcc and clang generate ror for the new code.
And, since this is based on the PCG implementations, have you checked upstream (PCG here) how UB for ror is handled there?
For the minimal implementation (https://www.pcg-random.org/download.html#id1), they simply use this (which also compiles to ror - checked on compiler explorer):
uint32_t
ror32 (const uint32_t bits, const uint32_t offset)
{
// bitwise rotate-right pattern recognized by gcc & clang iff 32==sizeof (bits)
return (bits >> offset) | (bits << (-offset & 31));
}
So without subtracting from 32. I guess it is a matter of taste which version is "better".
The C implementation has:
/*
* Rotate helper functions.
*/
[...]
/* Unfortunately, clang is kinda pathetic when it comes to properly
* recognizing idiomatic rotate code, so for clang we actually provide
* assembler directives (enabled with PCG_USE_INLINE_ASM). Boo, hiss.
*/
[...]
inline uint32_t pcg_rotr_32(uint32_t value, unsigned int rot)
{
#if PCG_USE_INLINE_ASM && __clang__ && (__x86_64__ || __i386__)
asm ("rorl %%cl, %0" : "=r" (value) : "0" (value), "c" (rot));
return value;
#else
return (value >> rot) | (value << ((- rot) & 31));
#endif
}
However, if I remove the inline and paste the code into compiler explorer, clang does the right thing even without using the assembly code.
I gave up on their C++ implementation though, it is simply too generic code for me to quickly check what it does
Also, there is https://en.cppreference.com/w/cpp/numeric/rotr now, would make sense to verify the generated assembly for it.
Generates the same "correct" code, but is C++20.
Left shifting a 32bit value by 32 bits is UB.