Open kotlarmilos opened 1 year ago
[ ] Investigate whether ILCompiler is able to detect the pattern
It does. For instance, this classlib project:
using System;
using System.Runtime.InteropServices;

#nullable disable

class C
{
    public static bool WithLocal<T>()
    {
        Type k = typeof(T);
        return k == typeof(sbyte) || k == typeof(byte);
    }

    public static bool WithoutLocal<T>()
    {
        return typeof(T) == typeof(sbyte) || typeof(T) == typeof(byte);
    }

    [UnmanagedCallersOnly(EntryPoint = nameof(BogusUsageToKeepFuncsInBinary))]
    public static void BogusUsageToKeepFuncsInBinary() =>
        Console.WriteLine(WithoutLocal<C>() && WithLocal<C>());
}
when built and inspected with:
# current runtime: linux-musl-arm64
$ dotnet8 publish -c Release -o dist --ucr -p:PublishAot=true
$ objdump -x dist/lib7.so | grep -E 'F.*With(out)?Local'
0000000000309730 l F __managedcode 0000000000000060 .hidden lib7_C__WithLocal<System___Canon>
0000000000309790 l F __managedcode 0000000000000014 .hidden lib7_C__WithoutLocal<System___Canon>
$ gdb dist/lib7.so -batch \
-ex "disassemble lib7_C__WithoutLocal<System___Canon>" \
-ex "disassemble lib7_C__WithLocal<System___Canon>"
gives:
Dump of assembler code for function lib7_C__WithoutLocal<System___Canon>:
0x0000000000309790 <+0>: stp x29, x30, [sp, #-16]!
0x0000000000309794 <+4>: mov x29, sp
0x0000000000309798 <+8>: mov w0, wzr
0x000000000030979c <+12>: ldp x29, x30, [sp], #16
0x00000000003097a0 <+16>: ret
End of assembler dump.
Dump of assembler code for function lib7_C__WithLocal<System___Canon>:
0x0000000000309730 <+0>: stp x29, x30, [sp, #-32]!
0x0000000000309734 <+4>: str x19, [sp, #24]
0x0000000000309738 <+8>: mov x29, sp
0x000000000030973c <+12>: str x0, [x29, #16]
0x0000000000309740 <+16>: ldr x0, [x0]
0x0000000000309744 <+20>: bl 0x2b30f0 <S_P_CoreLib_Internal_Runtime_CompilerHelpers_LdTokenHelpers__GetRuntimeType>
0x0000000000309748 <+24>: mov x19, x0
0x000000000030974c <+28>: nop
0x0000000000309750 <+32>: adr x0, 0x367f08
0x0000000000309754 <+36>: bl 0x2b30f0 <S_P_CoreLib_Internal_Runtime_CompilerHelpers_LdTokenHelpers__GetRuntimeType>
0x0000000000309758 <+40>: cmp x0, x19
0x000000000030975c <+44>: b.eq 0x309780 <lib7_C__WithLocal<System___Canon>+80> // b.none
0x0000000000309760 <+48>: nop
0x0000000000309764 <+52>: adr x0, 0x366250
0x0000000000309768 <+56>: bl 0x2b30f0 <S_P_CoreLib_Internal_Runtime_CompilerHelpers_LdTokenHelpers__GetRuntimeType>
0x000000000030976c <+60>: cmp x0, x19
0x0000000000309770 <+64>: cset x0, eq // eq = none
0x0000000000309774 <+68>: ldr x19, [sp, #24]
0x0000000000309778 <+72>: ldp x29, x30, [sp], #32
0x000000000030977c <+76>: ret
0x0000000000309780 <+80>: mov w0, #0x1 // #1
0x0000000000309784 <+84>: ldr x19, [sp, #24]
0x0000000000309788 <+88>: ldp x29, x30, [sp], #32
0x000000000030978c <+92>: ret
End of assembler dump.
WithLocal currently has inefficient codegen. Note that Roslyn generates WithLocal-like code for switch-expressions (see sharplab), so codegen of LessThan3, LessThan4 and LessThan5 from the sharplab sample is (unexpectedly) bad with NativeAOT. Mono can improve both (disjoint and inlined) forms from the get-go.
The mono AOT compiler does understand some of these patterns, via the code in intrinsics.c. The problem is generic sharing, which generates code where the type T is not exactly known, so a method like foo<int> is implemented by a shared method foo<T_INT> where T_INT is constrained to 'int' and enums whose base type is int. In that case, an expression like typeof(T)==typeof(byte) can be optimized away, but an expression like typeof(T)==typeof(int) cannot, since T may still be an int-based enum.
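To make the sharing problem concrete, here is a minimal sketch. The class and method names (SharingDemo, IsByte, IsInt) are illustrative and not taken from the runtime; DayOfWeek is used only as a convenient enum whose base type is int.

```csharp
using System;

static class SharingDemo
{
    // False for int and for every int-based enum, so a compiler could fold
    // this check to 'false' even in code shared across all of those types.
    public static bool IsByte<T>() => typeof(T) == typeof(byte);

    // True for int but false for an int-based enum such as DayOfWeek, so code
    // shared between the two cannot replace this check with a constant.
    public static bool IsInt<T>() => typeof(T) == typeof(int);
}
```

Here SharingDemo.IsInt<int>() is true while SharingDemo.IsInt<DayOfWeek>() is false, even though both instantiations would fall under the same foo<T_INT>-style shared method described above. IsByte is false for both.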
Investigate whether ILCompiler is able to detect the pattern
I don't think ILCompiler does anything here, it's all down to RyuJIT which does it.
The mono AOT compiler does understand some of these patterns, i.e. by the code in intrinsics.c. The problem is generic sharing, which generates code where the type T is not exactly known, so a method like foo<int> is implemented by a shared method foo<T_INT> where T_INT is constrained to 'int' and enums whose base type is int. In that case, an expression like typeof(T)==typeof(byte) can be optimized away, but an expression like typeof(T)==typeof(int) cannot.
Good point. With https://github.com/dotnet/runtime/issues/80941 we might be able to tell the Mono AOT compiler which generic types referenced in the program are only statically reachable, and so allow pattern specialization of generics.
I don't think ILCompiler does anything here, it's all down to RyuJIT which does it.
The proposed approach uses ILCompiler since we have already worked on its integration with Mono for iOS, which might let us confirm quickly whether it is feasible. Once it is confirmed, I suggest considering other options before the integration as well.
The problem is generic sharing, which generates code where the type T is not exactly known, so a method like foo<int> is implemented by a shared method foo<T_INT> where T_INT is constrained to 'int' and enums whose base type is int. In that case, an expression like typeof(T)==typeof(byte) can be optimized away, but an expression like typeof(T)==typeof(int) cannot.
Can't you just not share in this case?
The general premise here is that there is a large amount of generic code that exists which follows the pattern of:
if (typeof(T) == typeof(...))
{
    // Logic for Type 1
}
else if (typeof(T) == typeof(...))
{
    // Logic for Type 2
}
else
{
    // Fallback path
}
The reason it follows this pattern is that RyuJIT has always specialized value types. The fallback path is sometimes an actual shared path and sometimes a path which purely throws (such as in Vector###<T>). In the case where the fallback is just a throw, the prior (typeof(T) == typeof(...)) checks define the entire domain of n exact types that T can be. So for something like Vector###<T> there is no chance for it to be something like an Enum, it can only be int or uint. In some cases (like Vector###<T>.operator +) the int/uint paths are identical and could be shared, and in other cases (like Vector###<T>.Abs) they should be disjoint methods that are generated.
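As a hedged illustration of the throwing-fallback form, the sketch below uses a hypothetical helper, NumericHelpers.Abs<T>, that is not part of any library. It only mirrors the shape of the pattern: the typeof checks list every type T may be, and anything else throws.

```csharp
using System;

static class NumericHelpers
{
    // Hypothetical helper, for illustration only.
    public static T Abs<T>(T value) where T : struct
    {
        if (typeof(T) == typeof(int))
        {
            // Signed path: needs a real absolute-value computation.
            int v = (int)(object)value;
            return (T)(object)Math.Abs(v);
        }
        else if (typeof(T) == typeof(uint))
        {
            // Unsigned path: the value is already non-negative.
            return value;
        }
        else
        {
            // Fallback purely throws, so int and uint are the whole domain of T.
            throw new NotSupportedException(typeof(T).Name + " is not supported");
        }
    }
}
```

With per-type specialization each instantiation can reduce to a single branch. For example, NumericHelpers.Abs(-5) returns 5, NumericHelpers.Abs(7u) returns 7, and NumericHelpers.Abs(1.5) throws NotSupportedException.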
The biggest risk for USG is bad codegen (perfwise) and the biggest risk for specialization is code bloat. There is always going to be a balance, but provided the compiler tries to recognize the common patterns devs target we should end up generally in the right place. We can always look at providing some attribute in System.Runtime.CompilerServices that allows devs to annotate the types they would like specialized as well if they have more context than the compiler. Such a feature would allow us to annotate the exact 12 types for Vector###<T> and Mono could then generate the shared path for everything else.
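Purely as a sketch of what such an annotation could look like: no such attribute exists in System.Runtime.CompilerServices today, and SpecializeForAttribute, the Sketch namespace and MyVector<T> below are all invented names for illustration. Only two types are listed to keep the example short.

```csharp
using System;

namespace Sketch
{
    // HYPOTHETICAL attribute: not a real runtime API. It only records which
    // exact type arguments the author would like the compiler to specialize.
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Method)]
    public sealed class SpecializeForAttribute : Attribute
    {
        public SpecializeForAttribute(params Type[] types) => Types = types;

        // Exact types to specialize; everything else could use a shared path.
        public Type[] Types { get; }
    }

    // Hypothetical generic type annotated with its exact domain.
    [SpecializeFor(typeof(int), typeof(uint))]
    public struct MyVector<T> where T : struct
    {
        public T Value;
    }
}
```

A compiler honoring an annotation like this could emit dedicated code for the listed types and keep one shared method for everything else. Nothing in this sketch changes codegen by itself.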
Description
In the Mono AOT compiler, the typeof(T) pattern results in unnecessarily large and slow methods, especially in the GSHAREDVT variant (the fallback method for any value type). Making the AOT compiler understand the typeof(T) == typeof(...) pattern for value types and allowing specialization in such scenarios could bring improvements. It has been discussed in https://github.com/dotnet/runtime/issues/71431 and https://github.com/dotnet/runtime/issues/71430.
Tasks
typeof(T) == typeof(...)
We still have open questions related to the integration and what is required to introduce it as an experimental feature in .NET 8, so comments and feedback are welcome.