Open karolherbst opened 2 months ago
I dug a bit deeper and it seems I have threads racing on the `if(oldValues[csIndex] == 1000000000)`. Should that be an atomic operation instead?
Mhh, no, that would mean that something with `atomic_flag_test_and_set_explicit` is going wrong, but that's kind of weird. Maybe there is a coherency issue somewhere. Anyway, it's probably not a CTS bug then; I initially thought something else was going on.
This is the kernel being executed in the first `atomic_flag` test, but I'm actually wondering what in the OpenCL spec guarantees that this executes as expected by the CTS. More specifically, what prevents a thread from racing with a thread from a different subgroup on `destMemory[cnt]`? One thread could enter the iteration for e.g. `cnt == 5` while another thread has just reached `atomic_flag_clear_explicit(&destMemory[cnt],` for the same `cnt`, which means that two threads will enter the critical section; the second one just won't do anything.

I'm seeing this behavior with rusticl on zink on radv, and at the moment it's not clear to me whether that's my bug or the CTS's. Removing the `atomic_flag_clear_explicit(&destMemory[cnt],` makes only one thread execute the "critical section" for each value of `cnt`, but that fails the test later.

So what's actually guaranteeing the correctness of this test here? Or would this test need to be rewritten? I'm also mildly wondering what clvk did in order to pass this test or if that was never a problem in the first place. I can probably come up with a fix to guarantee that behavior, I'm just wondering if there is some undefined behavior at play here.
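
To make the scenario concrete, here is a minimal host-side C11 sketch of the pattern described above. It is not the CTS kernel: the loop structure is an assumption reconstructed from the description, it models a single work-group only (so the slot index is just `cnt`), and the names `flags`, `old_values`, `worker` and `run_claim_test` are hypothetical. With acquire on the test-and-set and release on the clear, a second thread that enters the section for the same `cnt` has to observe the first thread's write, so it finds the slot already taken and moves on. Whether the OpenCL kernel gets the same guarantee depends on the memory order and scope arguments it passes, which are cut off in the snippets quoted above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 64
#define UNCLAIMED 1000000000u /* sentinel taken from the condition quoted above */

static atomic_flag flags[MAX_THREADS];   /* stands in for destMemory (assumption) */
static unsigned old_values[MAX_THREADS]; /* stands in for oldValues; plain, non-atomic data */
static int thread_count;

static void *worker(void *arg)
{
    unsigned tid = (unsigned)(uintptr_t)arg;
    bool stop = false;

    for (int cnt = 0; !stop && cnt < thread_count; cnt++) {
        /* Try-lock: returns the previous state, so false means we own the flag now. */
        if (!atomic_flag_test_and_set_explicit(&flags[cnt], memory_order_acquire)) {
            /* Only the first visitor finds the sentinel and claims the slot. */
            if (old_values[cnt] == UNCLAIMED) {
                old_values[cnt] = tid;
                stop = true;
            }
            /* Release publishes the write above to the next thread that acquires. */
            atomic_flag_clear_explicit(&flags[cnt], memory_order_release);
        }
    }
    return NULL;
}

/* Runs n threads and returns how many slots were claimed by distinct threads. */
int run_claim_test(int n)
{
    pthread_t threads[MAX_THREADS];
    bool seen[MAX_THREADS] = { false };
    int claimed = 0;

    assert(n > 0 && n <= MAX_THREADS);
    thread_count = n;
    for (int i = 0; i < n; i++) {
        atomic_flag_clear(&flags[i]); /* put every flag into a known clear state */
        old_values[i] = UNCLAIMED;
    }
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, (void *)(uintptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < n; i++) {
        unsigned winner = old_values[i];
        if (winner < (unsigned)n && !seen[winner]) {
            seen[winner] = true;
            claimed++;
        }
    }
    return claimed;
}
```

Calling `run_claim_test(n)` from a small `main` (built with `-pthread`) should return `n` on a conforming C11 implementation: every slot ends up with exactly one winner even though several threads may pass through the same section one after another. If the write to the slot were not visible to the next thread that acquires the flag, two threads could both see the sentinel, which looks like the race on the `if` described above.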