Open laurentcau opened 2 years ago
Technically, HLSL argument passing is all by-value, meaning copy-in (and copy-out for inout/out args). This means that if you pass a structure from the StructuredBuffer to the function, it's loading the entire structure, then copying it into the input argument of the function.
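To make the copy-in semantics concrete, here is a minimal sketch. It is not the code from the original report: the `Item` struct, the `gItems` and `gOut` buffers, and both helper functions are hypothetical names chosen for illustration. It contrasts passing a whole structure by value with passing only an index, so that the payload read is written inside the branch in the source.

```hlsl
// Hypothetical layout; all names here are illustrative assumptions.
struct Item
{
    float  threshold;
    float4 payload;
};

StructuredBuffer<Item>     gItems : register(t0);
RWStructuredBuffer<float4> gOut   : register(u0);

// By-value parameter: semantically, the caller loads every field of the
// element and copies it into 'item' before the function body runs.
float4 SelectByValue(Item item, float x)
{
    float4 result = float4(0, 0, 0, 0);
    [branch]
    if (x > item.threshold)
        result = item.payload;
    return result;
}

// Index parameter: each field is read at its point of use, so the payload
// read appears inside the branch in the source.
float4 SelectByIndex(uint i, float x)
{
    float4 result = float4(0, 0, 0, 0);
    [branch]
    if (x > gItems[i].threshold)
        result = gItems[i].payload;
    return result;
}

[numthreads(64, 1, 1)]
void main(uint3 id : SV_DispatchThreadID)
{
    float x = (float)id.x;
    gOut[id.x] = SelectByValue(gItems[id.x], x) + SelectByIndex(id.x, x);
}
```

Whether the two functions end up producing different DXIL depends on the compiler version and optimization settings, so it is worth inspecting the disassembly rather than assuming either form keeps the load inside the branch.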
After inlining and some optimization, the aliasing is known, and in theory the load could sink into a branch. However, the branch gets eliminated by simplifycfg fairly early, even with the [branch] attribute, since the operations in the branch at this point are just trivial extract+fcmp+branch again.
Even if the branch were preserved until later, DXC is very conservative about sinking or hoisting loads and stores, because certain properties that are not tracked well in the IR could complicate things and make the transformation illegal. Generally, back-ends should know best where to move or group loads and stores for performance.
We can keep this issue open to track improving cases such as this, and potentially:
[branch] attribute
Hi,
I noticed a strange behavior of the compiler when using an intermediate value. It doesn't generate the same code when using a StructuredBuffer directly as when going through a variable. Here is an example:
I expected the compiler to generate the same code but it doesn’t.
Test1 generates code:
Test2 generates code:
The second version looks better since the memory fetch is done only when the first condition is true.