Open e-kayrakli opened 5 months ago
Is this by any chance related to op reduce array expressions requiring element-wise host <-> device transfers?
I don't think so, but I'm also unclear on what this refers to. I would imagine an element-wise transfer could be necessary if the reduce operation is executed outside of a GPU locale while using a GPU-based array, but I'm not sure what kind of code could lead to that. Could you elaborate?
It looks like --report-gpu might not work right for promotions, reductions, or loop expressions
We have some logic in the compiler where GPU transforms can fail much later, for various reasons. One example is unsupported reductions. Here's an example with an && reduction:
In the code above, assertOnGpu fails correctly. However, removing it and compiling with --report-gpu reports the loop as eligible.
FWIW, the compiler code for GPU transforms could use a big rework. The whole "late gpuization failure" logic seems to be a workaround that we should avoid. I would rather not put a band-aid on this --report-gpu issue, and instead fix it by reworking the compiler implementation.
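For anyone trying to reproduce this, a loop of the kind described might look like the sketch below. This is not the snippet from the report: the array `A`, the variable `allTrue`, and the size `n` are made up for illustration, and it assumes a Chapel build with GPU support and a version that accepts the `@assertOnGpu` attribute on loops.

```chapel
// Hypothetical reproducer; not the snippet from the original report.
// Assumes a Chapel build with GPU support (CHPL_LOCALE_MODEL=gpu).
config const n = 100;

on here.gpus[0] {
  var A: [1..n] bool = true;  // array allocated on the GPU sublocale
  var allTrue = true;

  // Asks the compiler to fail if this loop cannot become a GPU kernel.
  @assertOnGpu
  forall a in A with (&& reduce allTrue) {
    allTrue &&= a;  // accumulate into the && reduction's shadow variable
  }

  writeln(allTrue);
}
```

If the behavior matches the description above, compiling this as-is should trip the assertion, while removing the `@assertOnGpu` line and compiling with `--report-gpu` would still list the loop as eligible.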