Dougall and I determined the encoding by symmetry with other bit scan
instructions (ffs, bitrev, and nebulously popcount). To confirm the
behaviour, I hacked the compiler to replace imul with interleave, and
then wrote the following shader-runner test to check every pair of
16-bit inputs, which passes in under a second on M1:
[require]
GL ES >= 3.1
GLSL ES >= 3.10
[compute shader]
#version 310 es
layout (local_size_x = 32, local_size_y = 1) in;
layout(binding = 0) uniform atomic_uint good;
layout(binding = 0) uniform atomic_uint bad;
uint reference(uint x, uint y) {
    uint z = 0u;
    for (uint i = 0u; i < 16u; ++i) {
        z |= ((x & (1u << i)) << i);
        z |= ((y & (1u << i)) << (i + 1u));
    }
    return z;
}
uint result(uint x, uint y) {
    /* imul is overloaded: the hacked compiler emits interleave for it */
    return x * y;
}
void main (void)
{
    uint x = uint(gl_GlobalInvocationID.x);
    bool allOk = true;
    for (uint y = 0u; y < 65536u; ++y) {
        if ((reference(x, y) != result(x, y)))
            allOk = false;
    }
    if (allOk)
        atomicCounterIncrement(good);
    else
        atomicCounterIncrement(bad);
}
[test]
atomic counters 2
compute 2048 1 1
probe atomic counter 0 == 65536
probe atomic counter 1 == 0
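For anyone who wants to sanity-check reference() without a GPU, here is a minimal host-side sketch in C. It is not part of this change. It ports the bit-by-bit loop and compares it against the well-known shift-and-mask way of interleaving two 16-bit values. The names interleave_reference, spread16, interleave_fast and count_mismatches are hypothetical helpers made up for this illustration, and the sketch says nothing about how the hardware implements the instruction.

```c
#include <stdint.h>

/* Bit-by-bit reference, same logic as reference() in the shader:
 * bit i of x lands at bit 2i, bit i of y lands at bit 2i + 1. */
uint32_t interleave_reference(uint32_t x, uint32_t y)
{
    uint32_t z = 0;
    for (uint32_t i = 0; i < 16; ++i) {
        z |= (x & (1u << i)) << i;
        z |= (y & (1u << i)) << (i + 1);
    }
    return z;
}

/* Spread the low 16 bits of v so that bit i ends up at bit 2i. */
uint32_t spread16(uint32_t v)
{
    v &= 0x0000ffffu;
    v = (v | (v << 8)) & 0x00ff00ffu;
    v = (v | (v << 4)) & 0x0f0f0f0fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Shift-and-mask interleave of two 16-bit values. */
uint32_t interleave_fast(uint32_t x, uint32_t y)
{
    return spread16(x) | (spread16(y) << 1);
}

/* Count disagreements over a strided grid of 16-bit inputs.
 * step must be nonzero; step == 1 covers all 2^32 pairs, which is slow on a CPU. */
uint32_t count_mismatches(uint32_t step)
{
    uint32_t bad = 0;
    for (uint32_t x = 0; x < 65536u; x += step)
        for (uint32_t y = 0; y < 65536u; y += step)
            if (interleave_reference(x, y) != interleave_fast(x, y))
                ++bad;
    return bad;
}
```

Calling count_mismatches(251) from a small main() samples roughly 68,000 pairs and should return 0. The exhaustive sweep is left to the shader test above, which spreads it across 65536 invocations.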
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>