Closed ivanpovazan closed 1 year ago
Tagging subscribers to 'os-ios': @steveisok, @akoeplinger See info in area-owners.md if you want to be subscribed.
Author: | ivanpovazan |
---|---|
Assignees: | - |
Labels: | `design-discussion`, `os-ios`, `area-NativeAOT-coreclr` |
Milestone: | 8.0.0 |
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.
After some internal discussions with @rolfbjarne and @MichalStrehovsky, one idea that came up to tackle this limitation was to expose all managed methods with the UnmanagedCallersOnly attribute and change the Objective-C code to invoke managed methods symbolically. However, this approach also raises other questions, e.g. how to handle non-blittable parameter types.
Resolving tokens or looking up a lot of types or methods by name during startup is wasted time. It would be best to auto-generate C# code with embedded fully resolved references to the types and methods, and compile this code into the app.
However, this approach also raises other questions e.g., how to handle non-blittable parameter types, etc.
How are non-blittable parameters handled today?
Simplified version:
A custom linker step generates the Objective-C version of the object. For every member, code is generated here; it contains part of the marshalling. The generated code calls mono_runtime_invoke, which in turn calls xamarin_bridge_runtime_invoke_method and eventually bridges to this managed code, which does the rest of the marshalling on the managed side. It has pretty high overhead for simple methods (e.g. returning int/bool and taking no parameters), so it would be nice to improve at least that part of the design.
Resolving tokens or looking up a lot of types or methods by name during startup is wasted time.
I agree (this is one of the big reasons the static registrar exists in the first place).
It would be best to auto-generate C# code with embedded fully resolved references to the types and methods, and compile this code into the app.
I will try to look into this. There are however a few points:
The NativeAOT compiler does its own trimming (which was partially disabled for the initial proof-of-concept, and we still executed ILLinker), and one idea floating around was to figure out how to only use NativeAOT's trimming.
Can we run the trimmer in analysis-only mode where it only figures out the roots for the static registrar to use without actually rewriting the outputs?
I guess we could do that, or alternatively have the trimmer write its output, and then use it as input to the static registrar, but pass the original non-trimmed assemblies to the NativeAOT compiler.
Can we run the trimmer in analysis-only mode where it only figures out the roots for the static registrar to use without actually rewriting the outputs?
This feels a lot like some of the similar problems we have with custom steps, where they are mainly used as a pre-scanning mechanism. It would be nice to fold some of this into a separate tool. That would give some other benefits for the linker engineering as we wouldn't have to support as many features.
I'm still in the process of trying to understand this, so bear with me.
Do we expect the UnmanagedCallersOnly approach to have corresponding Objective-C code generated, similar to this: https://github.com/xamarin/xamarin-macios/blob/673cf3688622028ff9e390d4e58fbbc8ef06f3bf/tools/common/StaticRegistrar.cs#L3283-L4353
Can we avoid having to do the glue code and build the necessary structures/interfaces in C# directly? How well-defined is the obj-c ABI? I'm trying to map this to how we do COM/WinRT interop - those can get by without generating the extra native code, so I'm trying to understand where this is different.
Can we avoid having to do the glue code and build the necessary structures/interfaces in C# directly?
It's possible to create all the Objective-C data structures dynamically at runtime (using the Objective-C Runtime API). This is not a good solution though, for performance reasons: it's slow, and we'd have to create all the data structures at startup, so we'd be taking a pretty big startup hit (a few years ago it was ~2s for a macOS application on a performant desktop; on a slower mobile device it'll be much more).
How well-defined is the obj-c ABI?
It's well defined, but it's also a moving target (Apple adds to the ABI sometimes). They also make performance improvements often.
I'm trying to map this to how we do COM/WinRT interop - those can get by without generating the extra native code, so I'm trying to understand where this is different.
As mentioned above we can get by without generating the extra native code too, but it's slow.
The main problem is that for a number of C# classes, an equivalent Objective-C class must exist, and they must all exist before the app can launch, because we don't know which ones will be needed (Objective-C classes can be referenced dynamically, and this is very often done in storyboards: UI described in XML and often loaded at startup).
So the problem becomes how to create Objective-C classes efficiently, and the answer is to do what Apple/Xcode does: write Objective-C code, and then the required data structures will all be written to disk and loaded as needed at runtime (and very efficiently, since Apple has optimized this quite heavily in the past, and even continues to optimize it pretty much every year).
Another problem is that we really want to minimize our (dirty) memory footprint, because in some cases our limitations are quite strict. Creating all the Objective-C data structures at runtime uses a significant amount of dirty memory compared to using constant memory.
My current plan is to write something like this:
class AppDelegate : NSObject, IUIApplicationDelegate {
    // this method is written by the app developer
    public override bool FinishedLaunching (UIApplication app, NSDictionary options)
    {
        // ...
    }

    // the following method is generated/injected by the static registrar for the method above
    [UnmanagedCallersOnly (EntryPoint = "__registrar__uiapplicationdelegate_didFinishLaunching")]
    static byte __registrar__DidFinishLaunchingWithOptions (IntPtr handle, IntPtr selector, IntPtr p0, IntPtr p1)
    {
        var obj = (AppDelegate) Runtime.GetNSObject (handle);
        var p0Obj = (UIApplication) Runtime.GetNSObject (p0);
        var p1Obj = (NSDictionary) Runtime.GetNSObject (p1);
        // the bool result is returned to native code as a byte (1 or 0)
        return obj.FinishedLaunching (p0Obj, p1Obj) ? (byte) 1 : (byte) 0;
    }
}

// the name must match the EntryPoint of the UnmanagedCallersOnly method above
extern BOOL __registrar__uiapplicationdelegate_didFinishLaunching (id self, SEL _cmd, UIApplication *p0, NSDictionary *p1);
@interface AppDelegate : NSObject<UIApplicationDelegate> {
}
-(BOOL) application:(UIApplication *)p0 didFinishLaunchingWithOptions:(NSDictionary *)p1;
@end
@implementation AppDelegate {
}
-(BOOL) application:(UIApplication *)p0 didFinishLaunchingWithOptions:(NSDictionary *)p1
{
    return __registrar__uiapplicationdelegate_didFinishLaunching (self, _cmd, p0, p1);
}
@end
It's possible to create all the Objective-C data structures dynamically at runtime (using the Objective-C Runtime API). This is not a good solution though, because of performance reasons
I was thinking more in the sense of whatever data structures the Objective-C compiler places in the generated object file, the managed compiler could also generate and put into its own output, at least in theory (it's all object files in the end). That would fall apart if those structures are not well defined though.
Yes, we could potentially write an object file directly instead of going through Objective-C code and compiling that.
I think that would work, but it would also likely require some digging into the file format because it's not very well documented afaik (although clang is open source so it's not really hidden either).
I was thinking more in the sense of whatever data structures the Objective-C compiler places in the generated object file, the managed compiler could also generate and put into its own output, at least in theory (it's all object files in the end). That would fall apart if those structures are not well defined though.
I was looking into that in the past. It's certainly possible but non-trivial. The documentation is sparse. The data are stored in special sections in the object file.
If the Objective-C code was not doing any part of the fancy marshalling then perhaps it would be feasible. It would not necessarily be easier than just generating an ObjC file, compiling it, and passing it as another input to the linker.
My current plan is as follows:
Say we have a managed class that subclasses NSObject, and exports a method:
public partial class MyObject : NSObject {
    [Export ("doSomething:")]
    public void DoSomething (int abc)
    {
    }
}
We will generate the following wrapper code:
public partial class MyObject {
    [UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomething__")]
    static void __DoSomething__ (IntPtr handle, IntPtr sel, int abc)
    {
        var obj = (MyObject) Runtime.GetNSObject (handle);
        obj.DoSomething (abc);
        // process any other arguments to the managed method
    }
}
And the following Objective-C class:
// the native symbol exported by the UnmanagedCallersOnly method above
extern void __MyObject___DoSomething__ (id self, SEL sel, int abc);
@interface MyObject : NSObject {
}
-(void) doSomething: (int) abc;
@end
@implementation MyObject {
}
-(void) doSomething: (int) abc
{
    __MyObject___DoSomething__ (self, _cmd, abc);
}
@end
Note 1: the generated code isn't exactly as shown here, because there are many corner-cases that have to be handled, but this is the general idea.
Note 2: the above code should work when there's an AOT compiler, but it won't when we're using the JIT, because the native symbol __MyObject___DoSomething__ won't exist at build time. In that case, we'll generate a lookup mechanism, something like this:
@interface MyObject : NSObject {
}
-(void) doSomething;
@end
@implementation MyObject {
}
typedef void (*__MyObject___DoSomething__func) (id self, SEL sel);
-(void) doSomething
{
    // static, so the pointer filled in by the lookup is kept across calls
    static __MyObject___DoSomething__func __MyObject___DoSomething__;
    xamarin_lookup_unmanagedcallersonly ((void **) &__MyObject___DoSomething__, "MyAssembly", "__MyObject___DoSomething__");
    __MyObject___DoSomething__ (self, _cmd);
}
@end
where the xamarin_lookup_unmanagedcallersonly function will look for the UnmanagedCallersOnly trampoline when the function is first called.
/cc: @simonrozsival
@rolfbjarne would you generate the code using a Roslyn source generator?
I was looking into that in the past. It's certainly possible but non-trivial. The documentation is sparse. The data are stored in special sections in the object file.
Just a random thought - not sure if it's feasible. I see that objc supports __attribute__((weak)) on some things. Could we place that on a method? Could we make the objc-generated method body a weak symbol and generate an UnmanagedCallersOnly method with the exact same mangled name and signature? We'd leave generating the objc data structures to the objc compiler, but provide our own method bodies and avoid the size/perf impact of the thunk.
I don't know how much effort is it worth to put into it - how many methods do we need to expose in an average app, and how costly is the objc method that just thunks to our managed implementation (looking at Rolf's example, maybe the implementation ends up being just a tail call, and then it's cheap - as opposed to something that needs to build a call frame).
My current plan is as follows:
This looks good - I have a couple questions:
- How will exceptions be handled? Do we need a try/catch around the obj.DoSomething (abc); call? Managed exceptions leaking through the UnmanagedCallersOnly boundary would be a failfast.
- We could also potentially replace the cast in var obj = (MyObject) Runtime.GetNSObject (handle); with Unsafe.As depending on whether we assume the obj-c side to already be type safe. A cast costs about a dozen bytes, plus a small throughput cost.
- How will exceptions be handled? Do we need a try/catch around the obj.DoSomething (abc); call? Managed exceptions leaking through the UnmanagedCallersOnly boundary would be a failfast.
If an exception tries to exit an UnmanagedCallersOnly function, there should be an ObjectiveCMarshal.UnhandledExceptionPropagationHandler installed. This will translate managed exceptions into a native NSException.
@rolfbjarne would you generate the code using a Roslyn source generator?
No, it's generated in a custom linker step when the trimmer runs, so it's too late to run any source generators (iow we use Cecil to generate the IL directly).
- We could also potentially replace the cast in var obj = (MyObject) Runtime.GetNSObject (handle); with Unsafe.As depending on whether we assume the obj-c side to already be type safe.
Objective-C is not type-safe. In particular, Objective-C APIs are known to blatantly lie about their types (for instance: the headers say an API returns type X, but it returns type Y, which only quacks like X but isn't X at all).
A cast costs about a dozen bytes, plus a small throughput cost.
We're already doing a dictionary lookup (IntPtr -> object), so a cast is really a minor cost in the whole process.
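The trade-off described here can be sketched in a few lines. This is only an illustration, not the Xamarin runtime: HandleTableSketch, ViewSketch and ButtonSketch are hypothetical names, and the dictionary stands in for whatever Runtime.GetNSObject does internally. It shows the dictionary lookup followed by a checked cast, which is the guard that Unsafe.As would skip.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the handle-to-object map behind Runtime.GetNSObject.
public static class HandleTableSketch
{
    static readonly Dictionary<IntPtr, object> s_objects = new Dictionary<IntPtr, object>();

    public static void Register(IntPtr handle, object managed) => s_objects[handle] = managed;

    // Dictionary lookup (IntPtr -> object) followed by a checked cast.
    // The cast throws InvalidCastException when the stored object is not a T,
    // for example when the native side handed over a different type than advertised.
    public static T GetObject<T>(IntPtr handle) where T : class
    {
        return (T)s_objects[handle];
    }
}

// Two unrelated placeholder types used to provoke a failing cast.
public sealed class ViewSketch { }
public sealed class ButtonSketch { }
```

With a ViewSketch registered under a handle, GetObject<ViewSketch> returns it, while GetObject<ButtonSketch> for the same handle fails loudly instead of producing a mistyped reference.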
I was looking into that in the past. It's certainly possible but non-trivial. The documentation is sparse. The data are stored in special sections in the object file.
Just a random thought - not sure if it's feasible. I see that objc supports __attribute__((weak)) on some things. Could we place that on a method? Could we make the objc-generated method body a weak symbol and generate an UnmanagedCallersOnly method with the exact same mangled name and signature? We'd leave generating the objc data structures to the objc compiler, but provide our own method bodies and avoid the size/perf impact of the thunk.
That's certainly an intriguing idea, but __attribute__((weak)) doesn't seem to work:
@interface MyObject : NSObject {
}
-(void) myFunc __attribute__((weak));
@end
results in:
test.m:9:31: warning: 'weak' attribute only applies to variables, functions, and classes [-Wignored-attributes]
-(void) myFunc __attribute__((weak));
However, it might be possible to just not add the Objective-C implementation:
@interface MyObject : NSObject { }
-(void) myFunc;
@end
@implementation MyObject { }
// nothing here!
@end
and that compiles just fine, albeit with a warning:
test.m:12:17: warning: method definition for 'myFunc' not found
I wasn't able to figure out how to easily write a function named -[MyObject myFunc] in C/assembly though, so I'm not sure if it would work at runtime.
No, it's generated in a custom linker step when the trimmer runs, so it's too late to run any source generators (iow we use Cecil to generate the IL directly).
Even though it is a separate topic, we should also think about how ILLinker and NativeAOT would work together - be compatible, or how NativeAOT would implement this (and other custom linker steps), as it seems there is an unavoidable dependency between Xamarin and trimming phase.
We will generate the following wrapper code:
public partial class MyObject {
    [UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomething__")]
    static void __DoSomething__ (IntPtr handle, IntPtr sel, int abc)
    {
        var obj = (MyObject) Runtime.GetNSObject (handle);
        obj.DoSomething (abc);
        // process any other arguments to the managed method
    }
}
Do we need this for every callable method, or could this be generated per method signature only, using some kind of aggressive inlining inside NativeAOT to get rid of this frame completely at runtime?
There are at least two issues:
- The selector argument (sel) will never be present in the target method. This means we'll always need to shuffle parameters around in the stack frame.
[UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomething__")]
static void __DoSomething__ (IntPtr handle, IntPtr sel, int abc)
{
    __Signature_void_int32 (handle, sel, abc, &MyObject.DoSomething);
}
[UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomethingElse__")]
static void __DoSomethingElse__ (IntPtr handle, IntPtr sel, int abc)
{
    __Signature_void_int32 (handle, sel, abc, &MyObject.DoSomethingElse);
}
static void __Signature_void_int32 (IntPtr handle, IntPtr sel, int abc, ? method)
{
    var obj = (MyObject) Runtime.GetNSObject (handle);
    obj.(*method) (abc); // what would the type of 'method' be, and how would this be implemented in IL?
}
Maybe something like this could work:
// delegate* assumes we can hardcode the calling convention;
// in a delegate* type the return type comes last
static void __Signature_void_int32 (IntPtr handle, IntPtr sel, int abc, delegate*<MyObject, int, void> method)
{
    var obj = (MyObject) Runtime.GetNSObject (handle);
    method (obj, abc);
}
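A compilable variant of the per-signature idea might look like the sketch below. Several things here are assumptions rather than anything proposed in the thread: C# cannot take the address of an instance method with `&`, so the sketch uses one static shim per target method (DoSomethingShim and DoSomethingElseShim are hypothetical), a plain dictionary named Lookup stands in for Runtime.GetNSObject, and the entry points are ordinary static methods rather than UnmanagedCallersOnly exports so that they can be called from managed code. It needs unsafe code enabled (AllowUnsafeBlocks).

```csharp
using System;
using System.Collections.Generic;

public sealed class MyObjectSketch
{
    public int Last;
    public void DoSomething(int abc) => Last = abc;
    public void DoSomethingElse(int abc) => Last = -abc;
}

public static unsafe class SignatureSketch
{
    // Hypothetical stand-in for Runtime.GetNSObject.
    public static readonly Dictionary<IntPtr, object> Lookup = new Dictionary<IntPtr, object>();

    // One helper per signature (void, int32). It is generic over the receiver type,
    // so the checked cast is still performed for each concrete 'this' type.
    static void Signature_void_int32<T>(IntPtr handle, int abc, delegate*<T, int, void> method)
        where T : class
    {
        var obj = (T)Lookup[handle];
        method(obj, abc);
    }

    // Static shims: the address-of operator only applies to static methods.
    static void DoSomethingShim(MyObjectSketch self, int abc) => self.DoSomething(abc);
    static void DoSomethingElseShim(MyObjectSketch self, int abc) => self.DoSomethingElse(abc);

    // These play the role of the per-method entry points; 'sel' is dropped before the call.
    public static void DoSomethingEntry(IntPtr handle, IntPtr sel, int abc)
        => Signature_void_int32<MyObjectSketch>(handle, abc, &DoSomethingShim);

    public static void DoSomethingElseEntry(IntPtr handle, IntPtr sel, int abc)
        => Signature_void_int32<MyObjectSketch>(handle, abc, &DoSomethingElseShim);
}
```

The sketch only shows that the shape type-checks and dispatches correctly. Whether a compiler would fold the shim and helper into the entry point, and therefore whether this saves anything over one wrapper per method, is not something it demonstrates.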
@AaronRobinsonMSFT FYI
Maybe something like this could work
If we need the casting to guard type safety (which I assume we do based on the previous comment about Unsafe.As), I think it would have to be per signature + per type of this, and that would limit the savings.
Even though it is a separate topic, we should also think about how ILLinker and NativeAOT would work together - be compatible, or how NativeAOT would implement this (and other custom linker steps), as it seems there is an unavoidable dependency between Xamarin and trimming phase.
What decides whether we need to generate the UnmanagedCallersOnly method __DoSomething__? Is it based on the presence of DoSomething after trimming? And also what decides we need to generate the obj-c wrapper? Is it based on the presence of the UnmanagedCallersOnly method?
We need the Objective-C wrapper for methods in types subclassing Foundation.NSObject, where the method either has an [Export ("selector")] attribute itself, or overrides or implements a member that has an [Export ("selector")] attribute:
[Protocol ("MyProtocol")]
interface IMyProtocol {
    [Export ("myProtocolMethod")]
    void MyProtocolMethod ();
}
class MyBaseObject : NSObject {
    [Export ("myBaseMethod")]
    protected virtual void MyBaseMethod () {} // this method gets an Objective-C wrapper method
}
class MyObject : MyBaseObject, IMyProtocol {
    [Export ("myMethod")]
    void MyMethod () {} // this method gets an Objective-C wrapper method
    public void MyProtocolMethod () {} // this method gets an Objective-C wrapper method
    protected override void MyBaseMethod () {} // this method gets an Objective-C wrapper method
}
There are a couple of other scenarios as well, and numerous corner cases, but the general rule is that we need an Objective-C wrapper for any method with an Export attribute (directly or indirectly).
Then the UnmanagedCallersOnly method is needed whenever we have an Objective-C wrapper.
We need the Objective-C wrapper for methods in types subclassing Foundation.NSObject
Is this "types subclassing Foundation.NSObject that survived trimming", or are these rooted/never trimmed? How about the methods on these types - are those rooted or do we just need those methods that survived trimming? And if we allow trimming, how does it work if we were to trim all of IMyProtocol, but keep MyObject.MyProtocolMethod - would that method need a wrapper?
We need the Objective-C wrapper for methods in types subclassing Foundation.NSObject
Is this "types subclassing Foundation.NSObject that survived trimming", or are these rooted/never trimmed?
We root all subclasses from Foundation.NSObject.
How about the methods on these types - are those rooted or do we just need those methods that survived trimming?
Same: we root all methods with an Export attribute (directly or indirectly as above).
And if we allow trimming, how does it work if we were to trim all of IMyProtocol, but keep MyObject.MyProtocolMethod - would that method need a wrapper?
This is one of the reasons moving the registrar out of the custom linker steps is difficult: yes, MyObject.MyProtocolMethod still needs a wrapper even if IMyProtocol is trimmed away. We used to just root the interface to get around this, but what we do now is to store the interface in memory (using a custom linker step), so that the registrar later can figure out that MyObject.MyProtocolMethod comes from such an interface, even if the interface will be trimmed away.
Somewhat unconventional opinion:
So if we can come up with a simple set of rules all trimming tools should follow around Objective-C interop, personally I would be fine hardcoding these into the trimming tools (along with tests and everything). That said, ideally these behaviors should be runtime independent. That part might be problematic to achieve, so we may need to make some compromises. For example, if we think that Objective-C interop should be inherently supported on iOS targets, then we should hardcode all of it into the NativeAOT compiler (when it targets iOS).
It's part of our overall interop story (we try to make the complexities invisible from users) - @AaronRobinsonMSFT for confirmation.
Yes, this has been the case when we source generate something. The Trimmer specifically adds a decent amount of spooky behavior that can steal hours of your life, at least when working in the runtime. For users this is likely less of a concern, but enriching the Trimmer to do the right thing 90% of the time is a reasonable path forward in my opinion. An alternative to pushing it into the Trimmer itself is to encode it in an analyzer that will warn/error. I prefer the tooling to error out; however, the C# UX generally prefers analyzers or something that runs at design time to warn/error early.
We root all subclasses from Foundation.NSObject.
If I'm looking at the right code, this is more subtle:
If other platforms have a specific interop technology which is very common, like objective-C interop on iOS, I think it would make sense to hardcode knowledge of it into the trimming tools
Based on looking at https://github.com/xamarin/xamarin-macios/tree/b0c94b48a656b2b809467a87d8f2464a122ce2a7/tools/linker and around, I would prefer not to do such hardcoding, especially if we were required to duplicate the logic due to it being written in Cecil. The interop rules we hardcode for p/invokes and COM are well defined - they're actually public API contracts - they don't change. Looking through what the macios steps do - it's the opposite - it's special case after special case for internal implementation details of the macios interop library, and for various types in Apple SDK. It's a moving target with two free degrees of movement. This is more like WinRT++ than p/invokes.
We could express some of these relationships with DynamicDependency, and it would work for ILLink, but DynamicDependency is going to keep a lot more stuff than needed on the NativeAOT side (NativeAOT considers these "reflection used", which means it will generate a lot more data structures to support reflection with these - eliminating a lot of optimization opportunities).
I think running IL Linker with these steps and leaving breadcrumbs (with yet another custom step) as to what to keep when NativeAOT does its own trimming is a fine plan for .NET 8 or beyond. We can figure out the mechanisms to leave the breadcrumbs as the need arises.
@MichalStrehovsky those custom linker steps do several different (independent) things
[Preserve] attributes, etc.)
[assembly: LinkWith(...)] attributes the types used in the code
I agree that the custom marking logic is very specific and I don't think it should be hardcoded into the NativeAOT compiler. On the other hand, the part that generates the static registrar with those [UnmanagedCallersOnly] trampolines and the Objective-C code that @rolfbjarne proposed in this thread would IMO fit somewhere in the NativeAOT pipeline itself: somewhere after the dependency graph is built and before it starts generating native code.
All of our logic in macios is quite complex and full of corner cases, so I don't think it would be a good idea to have it anywhere else.
Going forward, if our desire is to remove the custom linker steps, I believe this is (a very high-level view of) our best approach:
DynamicDependency attributes according to what we know is safe for the trimmer to remove / want the trimmer to keep.
I believe this requires a couple of things from the NativeAOT compiler:
The main downside I see of this approach is that it'll slow down the build:
One major upside is that it should improve the testability of our code (it's not trivial to run the custom linker steps outside of an actual build).
Going forward, if our desire is to remove the custom linker steps, I believe this is (a very high-level view of) our best approach
This looks a lot more complex than what we have discussed in https://github.com/dotnet/runtime/issues/80912#issuecomment-1400771752 . Would this simpler approach discussed earlier be an option?
It should understand the DynamicDependency attributes the same way that the trimmer does.
We already do that, but I would like to avoid DynamicDependency for size reasons. By default, when NativeAOT compiles a method, we generate three things: the actual code bytes and unwinding information (this is pretty standard stuff that even the C++ compiler generates), and precise GC information (if we decide we want to use conservative stack scanning like Mono does, this can be discarded for a ~5% size saving). That's it. You probably noticed that we don't generate the name of the method, information about the owning type, the parameters to the method, etc. None of that is needed to run the code. If the compiler however finds out this extra information is needed, it will generate it. There are many ways the compiler can "find out" - one of those is DynamicDependency, another is TrimmerRootAssembly, descriptors, etc. Forcing something to be generated as reflection visible is potentially several times more overhead than just generating the method body.
If we're thinking about solutions beyond custom steps, here's what I've been thinking about with source generators:
Let's say the user writes what was in https://github.com/dotnet/runtime/issues/80912#issuecomment-1434662940. We generate the UnmanagedCallersOnly method with a source generator and additionally generate an "UnmanagedDynamicDependency" attribute on the class (or the constructors?) linking the constructors to the generated method. This would ensure that whenever the class is constructed, we generate the UCO wrapper (whether this will be a new attribute, or we say that DynamicDependency pointing to a UCO method with a named EntryPoint doesn't actually signal reflection use, is an implementation detail).
The source generator could also generate any additional bookkeeping that is necessary for the static registrar to work (being handwavy here).
I don't have a good sense of what the custom steps rewrite. I saw things like rewriting IntPtr.Size to 4/8 - those are unnecessary optimizations for NAOT, and IL Linker already does that on its own too - it's not an optimization specific to macios and shouldn't be done by macios steps. It also rewrites other methods - those could be IL Linker substitutions (that NAOT also supports).
Once native compilation/trimming is done, a separate tool can run that will look at what's left (we'd have plugins that would inspect the results depending on what the result is - native or IL), will look at the original IL, and generate whatever native artifacts are needed to glue things together.
Going forward, if our desire is to remove the custom linker steps, I believe this is (a very high-level view of) our best approach
This looks a lot more complex than what we have discussed in #80912 (comment) . Would this simpler approach discussed earlier be an option?
I'm sorry I was unclear: my thought is to implement the simple approach for .NET 8, while the more complex is potentially for .NET 9+.
If we're thinking about solutions beyond custom steps, here's what I've been thinking about with source generators:
Let's say the user writes what was in #80912 (comment). We generate the UnmanagedCallersOnly method with a source generator and additionally generate an "UnmanagedDynamicDependency" attribute on the class (or the constructors?) linking the constructors to the generated method. This would ensure that whenever the class is constructed, we generate the UCO wrapper (whether this will be a new attribute, or we say that DynamicDependency pointing to a UCO method with a named EntryPoint doesn't actually signal reflection use, is an implementation detail).
Exactly how the UCO method is generated doesn't really matter in this discussion (as long as it happens before the NativeAOT compiler runs), it can either be as a custom linker step, a source generator, or using another MSBuild task that executes before the NativeAOT compiler, or something else entirely.
What matters, however, is that we need a way to tell whoever does the tree shaking (be it ILLinker or NativeAOT) what can be trimmed away and what can't, and it would be highly desirable for us to have a single solution that works everywhere. If that's a DynamicDependency attribute, that's fine; if it's an XML descriptor, that's fine too.
Note that we don't only need to root API, we might also need to unroot API - I believe NativeAOT treats UCO methods with an EntryPoint as roots, and the behavior we need is that if another managed method (the one the UCO wrapper calls) survives trimming, then the UCO wrapper must exist, but otherwise it shouldn't.
So for the following example:
public partial class MyObject : NSObject {
    [Export ("doSomething:")]
    public void DoSomething (int abc)
    {
    }

    [UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomething__")]
    static void __DoSomething__ (IntPtr handle, IntPtr sel, int abc)
    {
        var obj = (MyObject) Runtime.GetNSObject (handle);
        obj.DoSomething (abc);
        // process any other arguments to the managed method
    }
}
The `__DoSomething__` method should survive trimming if and only if `DoSomething` did.
The source generator could also generate any additional bookkeeping that is necessary for the static registrar to work (being handwavy here).
I don't have a good sense of what the custom steps rewrite. I saw things like rewriting IntPtr.Size to 4/8 - those are unnecessary optimizations for NAOT, and IL Linker already does that on its own too - it's not an optimization specific to macios and shouldn't be done by macios steps. It also rewrites other methods - those could be IL Linker substitutions (that NAOT also supports).
The IntPtr.Size optimization wasn't very useful by itself, but it had a cascading effect in this scenario:
if (IntPtr.Size == 8) {
    DoA ();
} else {
    DoB ();
}
where we'd also remove the call to either DoA or DoB, and so on. This was a significant size improvement, because the linker at the time didn't know the target pointer size (and thus couldn't perform this optimization), and while Mono's AOT compiler would inline IntPtr.Size, it would not remove the unused DoX method. Since NativeAOT trims, this particular optimization should be unnecessary.
An example of an optimization I don't think NativeAOT would be able to do is this: https://github.com/rolfbjarne/xamarin-macios/blob/docs-custom-linker-steps/docs/custom-linker-steps/README.md#monotouchtunercoretypemapstep - code is optimized depending on whether a type is subclassed or not.
Once native compilation/trimming is done, a separate tool can run that will look at what's left (we'd have plugins that would inspect the results depending on what the result is - native or IL), will look at the original IL, and generate whatever native artifacts are needed to glue things together.
Yes, we'd need to be able to figure out which API NativeAOT removed and which it didn't (I'm assuming this information would be in some other format that's not IL?)
An example of an optimization I don't think NativeAOT would be able to do is this
I think that's one of the optimizations that is actually very easy for NAOT (I didn't check, though, whether it already does this today).
The `__DoSomething__` method should only survive trimming if and only if DoSomething did.
It'd be very nice if the native linker would remove it but I guess it could not and that's why we have the extra logic.
The `__DoSomething__` method should only survive trimming if and only if DoSomething did.
The following approach could solve this issue:
For a class MyObject
public partial class MyObject : NSObject
{
    [Export ("doSomething:")]
    public void DoSomething (int abc)
    {
    }

    [Export ("doSomethingElse:")]
    public void DoSomethingElse (int abc)
    {
    }
}
Generate:
public partial class MyObject : NSObject
{
    [MyUCOWrapper(nameof(__DoSomething_wrapper__))]
    [Export ("doSomething:")]
    public void DoSomething (int abc)
    {
    }

    [SoftRoot]
    [UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomething__")]
    static void __DoSomething_wrapper__ (IntPtr handle, IntPtr sel, int abc)
    {
        var obj = (MyObject) Runtime.GetNSObject (handle);
        obj.DoSomething (abc);
        // process any other arguments to the managed method
    }

    [MyUCOWrapper(nameof(__DoSomethingElse_wrapper__))]
    [Export ("doSomethingElse:")]
    public void DoSomethingElse (int abc)
    {
    }

    [SoftRoot]
    [UnmanagedCallersOnly (EntryPoint = "__MyObject___DoSomethingElse__")]
    static void __DoSomethingElse_wrapper__ (IntPtr handle, IntPtr sel, int abc)
    {
        var obj = (MyObject) Runtime.GetNSObject (handle);
        obj.DoSomethingElse (abc);
        // process any other arguments to the managed method
    }
}
Introduce soft roots for UCO methods residing in types subclassed from `NSObject`.
Soft roots are UCO methods which are used for wrapping methods with the `Export` attribute and are treated differently from regular UCO methods with `EntryPoint` defined (hard roots).
The reason for this is to have a mechanism to only wrap/export methods with the `Export` attribute that survived the trimming phase.
To handle them with ILCompiler and prevent rooting before the dependency analysis starts there would be 2 options:
- Use a dedicated `SoftRootAttribute` for this purpose
- Treat UCO methods in types subclassed from `NSObject` as soft roots (could impose other problems like "regular UCO methods"), in which case the `SoftRootAttribute` above is unnecessary

Introduce an implicit dependency graph edge (possibly a Conditional dependency, see https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/ilc-architecture.md#dependency-types) between a method with the `Export` attribute and its UCO wrapper (e.g., `DoSomething` -> `__DoSomething_wrapper__`).
This would make sure that soft roots (UCO wrappers) are not trimmed out.
I am not sure what would be the right way to express this relationship other than with yet another attribute (above I used `MyUCOWrapperAttribute` for simplicity) and the name of the wrapper it references. It would probably also make sense to utilize `DynamicallyAccessedMembersAttribute` (this is more of an implementation detail).
Having said all this, if user code references `MyObject::DoSomething` and does not reference `MyObject::DoSomethingElse`, there should be a relationship `Program:Main-[static]->MyObject:DoSomething-(conditional)->MyObject:__DoSomething_wrapper__`, which would keep all required methods for our use case and remove `DoSomethingElse` and its soft root wrapper `__DoSomethingElse_wrapper__`.
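The intended outcome can be illustrated with a toy marking pass. This is only a sketch of the idea, not ILCompiler's actual dependency analysis: `DependencyGraph`, `AddEdge`, `Mark`, and `SoftRootDemo` are hypothetical names, and the conditional edge is simplified to an ordinary edge from the exported method to its wrapper. The key assumption is that soft-root wrappers are never seeded as roots, so a wrapper is reached only through the method it wraps.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a trimmer's marking phase (not ILCompiler's real data structures).
public sealed class DependencyGraph
{
    private readonly Dictionary<string, List<string>> _edges =
        new Dictionary<string, List<string>>();

    // Records that keeping 'from' requires keeping 'to'.
    public void AddEdge(string from, string to)
    {
        if (!_edges.TryGetValue(from, out var targets))
            _edges[from] = targets = new List<string>();
        targets.Add(to);
    }

    // Returns every node reachable from the given roots.
    public HashSet<string> Mark(params string[] roots)
    {
        var marked = new HashSet<string>();
        var work = new Stack<string>(roots);
        while (work.Count > 0)
        {
            var node = work.Pop();
            if (!marked.Add(node))
                continue; // already visited; this also breaks cycles
            if (_edges.TryGetValue(node, out var targets))
                foreach (var target in targets)
                    work.Push(target);
        }
        return marked;
    }
}

public static class SoftRootDemo
{
    public static HashSet<string> KeptMembers()
    {
        var graph = new DependencyGraph();
        // User code calls DoSomething only.
        graph.AddEdge("Program.Main", "MyObject.DoSomething");
        // Proposed edge: an exported method pulls in its UCO wrapper.
        graph.AddEdge("MyObject.DoSomething", "MyObject.__DoSomething_wrapper__");
        graph.AddEdge("MyObject.DoSomethingElse", "MyObject.__DoSomethingElse_wrapper__");
        // Each wrapper calls the method it wraps.
        graph.AddEdge("MyObject.__DoSomething_wrapper__", "MyObject.DoSomething");
        graph.AddEdge("MyObject.__DoSomethingElse_wrapper__", "MyObject.DoSomethingElse");
        // Only Main is a root; the wrappers are soft roots and are not seeded.
        return graph.Mark("Program.Main");
    }
}
```

With the wrappers left out of the root set, `SoftRootDemo.KeptMembers()` contains `Program.Main`, `MyObject.DoSomething`, and `MyObject.__DoSomething_wrapper__`, while `DoSomethingElse` and its wrapper are never reached. Seeding the wrappers as hard roots instead would keep both exported methods regardless of what user code calls.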
It'd be very nice if the native linker would remove it but I guess it could not and that's why we have the extra logic.
I think we would still have a problem with the managed code referenced from the UCO itself e.g., if the code references some managed type the compiler would have to pack metadata info into internal data structures which will still take up the space but won't be needed during runtime.
This has been implemented in xamarin-macios now.
Description
Microsoft.macOS and Microsoft.iOS enable the Objective-C runtime to create instances of C# classes through a type registration system. The type registration can be static (used for device builds) or dynamic (used for emulators). At build time, the static registration inspects the assemblies used by the application through a custom linker step. It determines the classes and methods to register with Objective-C and generates a map, which is embedded into the binary. At application startup, the map is registered with the Objective-C runtime (source).
However, to resolve addresses of registered types, type (and module) metadata tokens are used, which are not available in NativeAOT representations of managed types. This limitation prevents using NativeAOT for applications built on top of Microsoft.macOS and Microsoft.iOS.
This issue has been opened to discuss possible approaches, ideas and suggestions on how to get past this limitation, with the goal of enabling NativeAOT to work with Xamarin.
/cc: @rolfbjarne
PS I would also like to give credit to @AustinWise who also reported this limitation in: https://github.com/dotnet/runtime/issues/77472