NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

ElectSync predicate is not working as expected due to index hoisting #3199

Open zasdfgbnm opened 4 hours ago

zasdfgbnm commented 4 hours ago

Due to index hoisting, the generated kernel looks like:

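bool b = electSync();  // hoisted out of both loops: one election for the whole kernel
for (...) {
  if (b) {
    // do something
  }
}
for (...) {
  if (b) {  // reuses the thread elected above
    // do something else
  }
}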

That is, we elect one thread at the beginning of the kernel and use that same thread for the entire kernel, instead of electing a ready thread every time an electSync() appears.
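For contrast, here is a rough host-side model of the two shapes. It is only a sketch, not actual nvFuser output: `electSync()` below is a hypothetical stand-in that counts how often an election happens, and `hoisted`, `per_use`, and the trip count `n` are names made up for illustration. The real device-side election behaves differently (it picks a lane within a warp); the model only shows how many elections each shape performs.

```cpp
#include <cassert>

// Hypothetical stand-in for the device-side electSync(): it does not elect
// anything, it only records that an election was requested.
static int elect_calls = 0;
bool electSync() {
  ++elect_calls;
  return true;
}

// Shape of the kernel as currently generated: the predicate is hoisted,
// so a single election is shared by every guarded block.
int hoisted(int n) {
  elect_calls = 0;
  bool b = electSync();
  for (int i = 0; i < n; ++i) {
    if (b) { /* do something */ }
  }
  for (int i = 0; i < n; ++i) {
    if (b) { /* do something else */ }
  }
  return elect_calls;
}

// Shape described as expected in this issue: the election is re-evaluated
// at each place electSync() appears.
int per_use(int n) {
  elect_calls = 0;
  for (int i = 0; i < n; ++i) {
    if (electSync()) { /* do something */ }
  }
  for (int i = 0; i < n; ++i) {
    if (electSync()) { /* do something else */ }
  }
  return elect_calls;
}
```

With a trip count of 4, the hoisted shape performs one election in total, while the per-use shape performs one per guarded block per iteration.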

zasdfgbnm commented 4 hours ago

cc @rdspring1