dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.06k stars 4.69k forks source link

Support recognizing `Blend` as commutative #82365

Open tannergooding opened 1 year ago

tannergooding commented 1 year ago

We don't currently track Blend as commutative and so we miss an opportunity to ensure that "memory operands" always appear on the RHS: https://github.com/SixLabors/ImageSharp/pull/2359#discussion_r1111147888

Support should be added to treat this as commutative which really just entails inverting the constant (e.g. Blend(x, y, 0b0111_0111) is equivalent to Blend(y, x, 0b1000_1000)).

There are actually a few operations this applies to that involve two inputs + control mask. Not all of them are as trivially as inverting the constant, however.

ghost commented 1 year ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak See info in area-owners.md if you want to be subscribed.

Issue Details
We don't currently track `Blend` as commutative and so we miss an opportunity to ensure that "memory operands" always appear on the RHS: https://github.com/SixLabors/ImageSharp/pull/2359#discussion_r1111147888 Support should be added to treat this as commutative which really just entails inverting the constant (e.g. `Blend(x, y, 0b0111_0111)` is equivalent to `Blend(y, x, 0b1000_1000)`).
Author: tannergooding
Assignees: -
Labels: `area-CodeGen-coreclr`, `untriaged`
Milestone: -
IDisposable commented 1 year ago

I'm trying to grok this request (as I know nothing about Avx)

In the process, I noticed this code in the /src/tests/JIT/HardwareIntrinsics/X86/Avx1/Blend.cs test code at line 29-36, that looks wrong to me:

// SDDD SDDD
var vf3 = Avx.Blend(vf1, vf2, 1);
Unsafe.Write(floatTable.outArrayPtr, vf3);

if (!floatTable.CheckResult((x, y, z) => (z[0] == y[0]) && (z[1] == x[1]) &&
                                         (z[2] == x[2]) && (z[3] == x[3]) &&
                                         (z[4] == x[4]) && (z[5] == x[5]) &&
                                         (z[6] == x[6]) && (z[7] == x[7])))

I think the pattern in the comment is supposed to mean that we're expecting the Source or Destination values to appear based on the third argument (opmask) to the Avx.Blend() call which would make sense in float/_mm512_mask_blend_ps/VBLENDMPS case with a instruction mask of 0b0001

By that reading (which I may be wrong about), the test on line 35 against z[4] should be comparing to y[4] not x[4]. I am not sure if the comment is wrong (e.g. should be // SDDD DDDD) or if the operator value of 1 is wrong.

This is not area of any expertise at all given this is my very first look at AVX at any level, but the code at line 83-90 looks to match my expectation in that S in the comment means compare to y and D means compare to x.

// SDSD SDSD
vf3 = Avx.Blend(vf1, vf2, 85);
Unsafe.Write(floatTable.outArrayPtr, vf3);

if (!floatTable.CheckResult((x, y, z) => (z[0] == y[0]) && (z[1] == x[1]) &&
                                         (z[2] == y[2]) && (z[3] == x[3]) &&
                                         (z[4] == y[4]) && (z[5] == x[5]) &&
                                         (z[6] == y[6]) && (z[7] == x[7])))

If I'm absolutely nuts, please let me know, I don't want to start down the road with the wrong understanding.

JulieLeeMSFT commented 1 year ago

@tannergooding, could you guide @IDisposable for this?

tannergooding commented 1 year ago

We won't be able to land this in .NET 8; but it can be done at any point in the future.

I am not sure if the comment is wrong

The comment is wrong, the behavior is:

IF (IMM8[0] = 0) THEN DEST[31:0] :=SRC1[31:0]
    ELSE DEST [31:0] := SRC2[31:0] FI
IF (IMM8[1] = 0) THEN DEST[63:32] := SRC1[63:32]
    ELSE DEST [63:32] := SRC2[63:32] FI
IF (IMM8[2] = 0) THEN DEST[95:64] := SRC1[95:64]
    ELSE DEST [95:64] := SRC2[95:64] FI
IF (IMM8[3] = 0) THEN DEST[127:96] := SRC1[127:96]
    ELSE DEST [127:96] := SRC2[127:96] FI
IF (IMM8[4] = 0) THEN DEST[159:128] := SRC1[159:128]
    ELSE DEST [159:128] := SRC2[159:128] FI
IF (IMM8[5] = 0) THEN DEST[191:160] := SRC1[191:160]
    ELSE DEST [191:160] := SRC2[191:160] FI
IF (IMM8[6] = 0) THEN DEST[223:192] := SRC1[223:192]
    ELSE DEST [223:192] := SRC2[223:192] FI
IF (IMM8[7] = 0) THEN DEST[255:224] := SRC1[255:224]
    ELSE DEST [255:224] := SRC2[255:224] FI.

So it is not done as 1 operation repeated across 2x128-bit lanes, but rather as 1x256-bit lane instead.

IDisposable commented 1 year ago

Excellent, I'll tackle this for the next release. Thanks for the confirmation @tannergooding