Open jakobbotsch opened 2 months ago
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.
We CSE the two zero extensions, but the heuristic then reasons that storing with a zero extension is free in the backend, so it doesn't give any profit to introducing the widened IV. It isn't able to reason that widening will allow the CSE'd local to be removed entirely.
***** BB07 [0001]
STMT00016 ( ??? ... ??? )
N004 ( 0, 0) [000119] DA--------- ▌ STORE_LCL_VAR int V02 loc2 d:3 $VN.Void
N003 ( 0, 0) [000118] ----------- └──▌ PHI int $301
N001 ( 0, 0) [000123] ----------- pred BB07 ├──▌ PHI_ARG int V02 loc2 u:4
N002 ( 0, 0) [000122] ----------- pred BB06 └──▌ PHI_ARG int V02 loc2 u:2 $c3
***** BB07 [0001]
STMT00004 ( 0x012[E-] ... 0x01E )
N016 ( 12, 20) [000021] DA--GO----- ▌ STORE_LCL_VAR int V04 tmp1 d:2 $VN.Void
N015 ( 12, 20) [000020] -A--GO----- └──▌ ADD int <l:$547, c:$548>
N013 ( 10, 18) [000043] nA--GO-N--- ├──▌ IND ushort <l:$4c3, c:$501>
N012 ( 6, 15) [000041] -A---O----- │ └──▌ ARR_ADDR byref ushort[] $84
N011 ( 6, 15) [000040] -A-----N--- │ └──▌ ADD byref $403
N009 ( 5, 6) [000039] -A-----N--- │ ├──▌ ADD long $3c9
N007 ( 4, 5) [000037] -A-----N--- │ │ ├──▌ LSH long $3c8
N005 ( 3, 4) [000130] -A--------- │ │ │ ├──▌ COMMA long $3c7
N003 ( 2, 3) [000128] DA--------- │ │ │ │ ├──▌ STORE_LCL_VAR long V05 cse0 d:1 $VN.Void
N002 ( 2, 3) [000035] ---------U- │ │ │ │ │ └──▌ CAST long <- uint $3c7
N001 ( 1, 1) [000032] ----------- │ │ │ │ │ └──▌ LCL_VAR int V02 loc2 u:3 $301
N004 ( 1, 1) [000129] ----------- │ │ │ │ └──▌ LCL_VAR long V05 cse0 u:1 $3c7
N006 ( 1, 1) [000036] -------N--- │ │ │ └──▌ CNS_INT long 1 $1c1
N008 ( 1, 1) [000038] ----------- │ │ └──▌ CNS_INT long 12 $1c3
N010 ( 3, 10) [000132] H---------- │ └──▌ CNS_INT(h) ref '003020600900305001001806400008102900700000008006708200002609500' $180
N014 ( 1, 1) [000019] ----------- └──▌ CNS_INT int -48 $c8
***** BB07 [0001]
STMT00005 ( ??? ... ??? )
N012 ( 7, 7) [000024] nA--GO----- ▌ STOREIND int $VN.Void
N010 ( 2, 3) [000055] -----O-N--- ├──▌ COMMA byref $85
N001 ( 0, 0) [000047] ----------- │ ├──▌ NOP void $VN.Void
N009 ( 2, 3) [000054] -----O----- │ └──▌ ARR_ADDR byref int[] $85
N008 ( 2, 3) [000053] -------N--- │ └──▌ ADD byref $404
N002 ( 1, 1) [000044] ----------- │ ├──▌ LCL_VAR ref V01 loc1 u:2 $2c0
N007 ( 3, 3) [000052] -------N--- │ └──▌ ADD long $3cc
N005 ( 2, 2) [000050] -------N--- │ ├──▌ LSH long $3cb
N003 ( 1, 1) [000131] ----------- │ │ ├──▌ LCL_VAR long V05 cse0 u:1 $3c7
N004 ( 1, 1) [000049] -------N--- │ │ └──▌ CNS_INT long 2 $1c4
N006 ( 1, 1) [000051] ----------- │ └──▌ CNS_INT long 16 $1c5
N011 ( 1, 1) [000022] ----------- └──▌ LCL_VAR int V04 tmp1 u:2 (last use) <l:$547, c:$548>
STMT00016 ( ??? ... ??? )
N004 ( 0, 0) [000119] DA--------- ▌ STORE_LCL_VAR int V02 loc2 d:3 $VN.Void
N003 ( 0, 0) [000118] ----------- └──▌ PHI int $301
N001 ( 0, 0) [000123] ----------- pred BB07 ├──▌ PHI_ARG int V02 loc2 u:4
N002 ( 0, 0) [000122] ----------- pred BB06 └──▌ PHI_ARG int V02 loc2 u:2 $c3
=> <L00, V02.2 (0), 1>
V02 is a primary induction variable in L00
Exit BB10 does not need a sink; V02 is not live-in
Estimated cycle improvement: 0 cycles per invocation
Estimated size improvement: 0 bytes
Widening is not profitable
With the extended analysis done in the strength reduction PR we should have enough context to replace all uses. With that PR we have
Found 8 different induction variables.
[0]: int <L00, 0, 1> (primary IV V02, 2 uses)
Uses: [000025] [000032]
[1]: int <L00, 1, 1> (derived IV, 2 uses)
Uses: [000008] [000027]
Score: 0
[2]: long <L00, 12, 2> (derived IV, 1 uses)
Uses: [000039] (AM)
Score: -2
[3]: long <L00, 0, 2> (derived IV, 1 uses)
Uses: [000037] (AM)
Score: -2
[4]: long zext<64>(<L00, 0, 1>) (derived IV, 4 uses)
Uses: [000131] [000035] [000129] [000130]
Score: -2
[5]: byref <L00, (V01.2 + 16), 4> (derived IV, 3 uses)
Uses: [000053] (AM) [000054] (AM) [000055]
Score: -2
[6]: long <L00, 16, 4> (derived IV, 1 uses)
Uses: [000052] (AM)
Score: -2
[7]: long <L00, 0, 4> (derived IV, 1 uses)
Uses: [000050] (AM)
Score: -2
and could notice that derived IV [4] will all be replaceable by the new primary IV introduced when widening [0].
Another relevant example is
private static int Foo(int[] arr, int start, int count)
{
int sum = 0;
for (int i = 0; i < count; i++)
{
sum += arr[start++];
}
return sum;
}
where we similarly end up with a local due to the start++
that blocks things.
Example from @richlander:
Loop codegen:
Expected: