Description
Given a simple program:
static unsafe void Iterate(int* nums, nuint cnt) {
    var sum = 0;
    var iter = new PtrIter<int>(nums, cnt);
    while (iter.Next(out var n)) {
        sum += n;
    }
    Console.WriteLine(sum);
}

unsafe struct PtrIter<T>(T* ptr, nuint count)
    where T: unmanaged {
    public bool Next(out T item) {
        if (count != 0) {
            item = *ptr;
            ptr++;
            count--;
            return true;
        }
        item = default;
        return false;
    }
}
Iterate compiles to:
G_M000_IG01:                ;; offset=0x0000
       sub      rsp, 40
G_M000_IG02:                ;; offset=0x0004
       xor      eax, eax
       test     rdx, rdx
       je       SHORT G_M000_IG04
       align    [0 bytes for IG03]
G_M000_IG03:                ;; offset=0x000B
       mov      r8d, dword ptr [rcx]
       add      rcx, 4
       dec      rdx
       add      eax, r8d
       test     rdx, rdx   ;; <-- if we reorder dec and add, this test becomes redundant as j.cc can simply consume the flag
       jne      SHORT G_M000_IG03
G_M000_IG04:                ;; offset=0x001D
       mov      ecx, eax
       call     [System.Console:WriteLine(int)]
       nop
G_M000_IG05:                ;; offset=0x0026
       add      rsp, 40
       ret
This is quite a bit worse than the code generated for a similar loop written as a foreach over a plain array:
G_M000_IG02:                ;; offset=0x0000
       xor      eax, eax
       mov      edx, dword ptr [rcx+0x08]
       test     edx, edx
       jle      SHORT G_M000_IG05
G_M000_IG03:                ;; offset=0x0009
       add      rcx, 16
       align    [0 bytes for IG04]
G_M000_IG04:                ;; offset=0x000D
       add      eax, dword ptr [rcx]
       add      rcx, 4
       dec      edx
       jne      SHORT G_M000_IG04
G_M000_IG05:                ;; offset=0x0017
       mov      ecx, eax
G_M000_IG06:                ;; offset=0x0019
       tail.jmp [System.Console:WriteLine(int)]
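The source for the array-based comparison is not included above. As a rough sketch only, it was presumably something along these lines; the names ArrayBaseline, Sum, and IterateArray are hypothetical, and the split into two methods is an assumption made so the result can be checked:

```csharp
using System;

static class ArrayBaseline
{
    // Assumed shape of the "plain array foreach" loop; not taken from the report.
    public static int Sum(int[] nums)
    {
        var sum = 0;
        foreach (var n in nums)
        {
            sum += n;
        }
        return sum;
    }

    // Mirrors Iterate: sum the elements, then print the result.
    public static void IterateArray(int[] nums) => Console.WriteLine(Sum(nums));
}
```

Whether this exact form reproduces the listing above depends on the runtime version and on inlining decisions, so treat it as an illustration of the comparison rather than the original test case.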
Analysis
The test could be elided if the JIT gained a peephole optimization that reorders arithmetic operations when a later instruction can consume the flags they set. Here, moving dec rdx after add eax, r8d would let jne consume the zero flag set by dec directly.
A more minor issue is the missed opportunity to fold the mov load into the add as a memory operand (add eax, dword ptr [rcx]), as the array version does.
I have also noticed that merging the pointer dereference and post-increment into *ptr++ leads to worse codegen overall (it breaks the otherwise perfect output on ARM64), even though it shouldn't.
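For reference, the merged form mentioned above might look like the following minimal sketch. PtrIterMerged, Demo, and SumMerged are hypothetical names introduced for illustration; the only intended difference from PtrIter is the single `*ptr++` expression. It assumes C# 12 (primary constructors on structs) and a project with AllowUnsafeBlocks enabled.

```csharp
static class Demo
{
    // Hypothetical driver: pins the array and sums it through the iterator.
    public static unsafe int SumMerged(int[] data)
    {
        var sum = 0;
        fixed (int* p = data)
        {
            var iter = new PtrIterMerged<int>(p, (nuint)data.Length);
            while (iter.Next(out var n))
            {
                sum += n;
            }
        }
        return sum;
    }
}

unsafe struct PtrIterMerged<T>(T* ptr, nuint count)
    where T : unmanaged
{
    public bool Next(out T item)
    {
        if (count != 0)
        {
            // Dereference and post-increment merged into one expression.
            item = *ptr++;
            count--;
            return true;
        }
        item = default;
        return false;
    }
}
```

Both forms are semantically equivalent (Demo.SumMerged(new[] { 1, 2, 3, 4 }) returns 10), which is why a difference in the generated code is unexpected.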
Regression?
No