dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.16k stars 4.72k forks source link

[API Proposal]: "*App*" wildcard for [UnsafeAccessor] assembly name #106216

Closed Sergio0694 closed 1 month ago

Sergio0694 commented 2 months ago

Background

As part of CsWinRT 2.1.1, we introduced AOT support in WinRT scenarios. In order to make this happen, CsWinRT now also generates a fair amount (read: a metric ton) of code to make things such as WinRT generics work. This support comes with several drawbacks though. The first one is that it is viral: we need every single library to reference CsWinRT in order to make types AOT compatible when used for WinRT interop. This doesn't just affect libraries that are specific to Windows, but any library which contains types that will be used in WinRT scenarios. For instance, I've had to add the Windows TFM to the MVVM Toolkit, so that the types in the MVVM Toolkit could get the necessary supporting code generated. This isn't really fixable, it's just the state of things now that WinRT support is lifted.

On top of this however, we have two main problems that we'd like to solve:

For instance, here's just some of these supporting types generated in the MVVM Toolkit:

image

It's very easy for a large app to have even hundreds of these spread across all your dependencies and projects. And it's not uncommon that a lot of these will just be copies and copies of the same combinations of type arguments (eg. string).

We'd like improve things and move as much code generation as possible down to the published projects.

API proposal

The proposal is not an API, but an extension on top of #90081. We'd like to add a new "*App*" wildcard that can be used in place of assembly names for [UnsafeAccessor] methods. The exact semantics would be controlled via a new feature switch that can be configured via the new UnsafeAccessorOutputAssemblyResolutionType property. There would be two modes:

Example use

Consider the current CsWinRT code generation for some enumerable type, such as ObservableGroup<TKey, TElement>:

Generated CCW vtable (click to expand):
```csharp internal sealed class ObservableGroupedCollection_TKey__TElement_WinRTTypeDetails : IWinRTExposedTypeDetails { public ComWrappers.ComInterfaceEntry[] GetExposedInterfaces() { _ = IReadOnlyList_System_Collections_IList.Initialized; _ = IReadOnlyList_System_ComponentModel_INotifyPropertyChanged.Initialized; _ = IReadOnlyList_System_Collections_Specialized_INotifyCollectionChanged.Initialized; _ = IReadOnlyList_System_Collections_IEnumerable.Initialized; _ = IReadOnlyList_object.Initialized; _ = IEnumerable_System_Collections_IList.Initialized; _ = IEnumerable_System_ComponentModel_INotifyPropertyChanged.Initialized; _ = IEnumerable_System_Collections_Specialized_INotifyCollectionChanged.Initialized; _ = IEnumerable_System_Collections_IEnumerable.Initialized; _ = IEnumerable_object.Initialized; return new ComWrappers.ComInterfaceEntry[14] { new ComWrappers.ComInterfaceEntry { IID = IReadOnlyListMethods.IID, Vtable = IReadOnlyListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IReadOnlyListMethods.IID, Vtable = IReadOnlyListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IReadOnlyListMethods.IID, Vtable = IReadOnlyListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IReadOnlyListMethods.IID, Vtable = IReadOnlyListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IReadOnlyListMethods.IID, Vtable = IReadOnlyListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IListMethods.IID, Vtable = IListMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = INotifyCollectionChangedMethods.IID, Vtable = INotifyCollectionChangedMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = INotifyPropertyChangedMethods.IID, Vtable = INotifyPropertyChangedMethods.AbiToProjectionVftablePtr }, new ComWrappers.ComInterfaceEntry { IID = IEnumerableMethods.IID, Vtable = IEnumerableMethods.AbiToProjectionVftablePtr } }; } } ```

This, along with all of those IReadOnlyList_System_Collections_IList etc. implementations (each of those is and lot of code as well). With this proposed feature, we'd be able to not generate any of that, and simply do this instead:

[assembly: WinRTGenericTypeInstantiation(typeof(IReadOnlyList<IList>))]
[assembly: WinRTGenericTypeInstantiation(typeof(IReadOnlyList<INotifyPropertyChanged>))]
// etc.

For any generic type instantiation we need. Then, replace those Initialized accesses with:

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "get_Initialized")]
[UnsafeAccessorType("WinRT.GenericHelpers.IReadOnlyList_System_Collections_IList, *App*")]
static extern bool IReadOnlyList_System_Collections_IList();

And just call those, which would call the shared generated generic type instantiation in the final assembly. This support could technically us to go even further and potentially just have the entire vtable implementation call to some well known generated method in the final assembly, if we found that method to be even better for CsWinRT (either would work).

Essentially, this allows us to:

  • Share all these instantiations
  • Minimize the generated code that has to be updated in all dependencies
  • Keep everything fully trimmable when in self-contained mode
  • Still work in ALC scenarios
  • Be predictable (the behavior depends on the feature switch, so it's consistent)

Additional information

This design includes suggestions from @jkoritzinsky, who pointed out that in several scenarios around ALC, there is no "entry assembly", or you might have a given dependency library being loaded in one ALC and then referenced by some other code in another ALC. In those cases, the "Dynamic" mode would allow eg. CsWinRT to implement an appropriate resolver callback.

cc. @AaronRobinsonMSFT @jkotas @MichalStrehovsky @agocke

dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices See info in area-owners.md if you want to be subscribed.

AaronRobinsonMSFT commented 2 months ago

The https://github.com/dotnet/runtime/issues/90081 propsal is written in a very narrow way. The input string would simply be passed to Type.GetType(string typeName) and load the target type. This proposal would increase that complexity substantially. Instead of expecting simple Type strings that can easily be computed in the runtime, we would be creating a type of Domain Specific Language (DSL) that we'd need to parse and then subsequently reconstruct with a new name.

The Static version seems okay, but does come with serious costs around manipulating the supplied typeName, which I would prefer to avoid. I could see a case where UnsafeAccessorTypeAttribute has a seperate AssemblyName property/enum that could be set, but that again means we are expected to construct a type name out of parts.

The Dynamic version is ill-advised because calling the same function would seem to change the content of the method. This would be a redesign of the entire feature. Once an UnsafeAccessor is generated, the code is set. This proposal would mean we would need to look up the stack determine what we are calling to know what to generate alternatively we could bake that look up in the generated stub, but that is quite expensive too. I'd be very much against this as it would impose substantial complexity.

I will need to read over the issues here, but the Dynamic version is a non-starter in my opinion. The Static one seems feasible.

jkoritzinsky commented 2 months ago

In my initial thinking, the Dynamic mode would call the resolution callback once per UnsafeAccessor method (when the IL is generated and the target is resolved). It wouldn't call it every time the method is called.

jkotas commented 2 months ago

Why is it not an option to use a fixed assembly name to generate the app specific WinRT interop?

AaronRobinsonMSFT commented 2 months ago

In my initial thinking, the Dynamic mode would call the resolution callback once per UnsafeAccessor method (when the IL is generated and the target is resolved). It wouldn't call it every time the method is called.

Okay, so that is the latter case. For me the concern comes down to I can call the function with the same inputs, but its behavior changes if Assembly A calls it vs if Assembly B calls it. That is how I read that feature and I'm not a fan. It creates a very complex scenario. Everytime an UnsafeAccessor is JIT, we need to determine details about the caller, that seems less than ideal.

jkoritzinsky commented 2 months ago

In my initial thinking, the Dynamic mode would call the resolution callback once per UnsafeAccessor method (when the IL is generated and the target is resolved). It wouldn't call it every time the method is called.

Okay, so that is the latter case. For me the concern comes down to I can call the function with the same inputs, but its behavior changes if Assembly A calls it vs if Assembly B calls it. That is how I read that feature and I'm not a fan. It creates a very complex scenario. Everytime an UnsafeAccessor is JIT, we need to determine details about the caller, that seems less than ideal.

It wouldn't change based on caller of the UnsafeAccessor method. Assembly A and B would both have their own UnsafeAccessor methods that happen to be described the same but declared in different assemblies.

In any case, @jkotas's idea would be significantly more straightforward if it would solve the scenario.

AaronRobinsonMSFT commented 2 months ago

It wouldn't change based on caller of the UnsafeAccessor method. Assembly A and B would both have their own UnsafeAccessor methods that happen to be described the same but declared in different assemblies.

I don't think I understand this. So let's say assembly A and assembly B both have the following public type:

public class Accessor
{
    [UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "get_Initialized")]
    [UnsafeAccessorType("WinRT.GenericHelpers.IReadOnlyList_System_Collections_IList, *App*")]
    public static extern bool IReadOnlyList_System_Collections_IList();
}

If I call A.Accessor.IReadOnlyList_System_Collections_IList() vs B.Accessor.IReadOnlyList_System_Collections_IList(), will they give me different things or the same thing? From what you're saying it sounds like the different because they are using the ALC they are loaded in to resolve, right?

AaronRobinsonMSFT commented 2 months ago

would be significantly more straightforward if it would solve the scenario.

Why won't it? I thought I had proposed this offline a while ago as well.

Sergio0694 commented 2 months ago

"Why is it not an option to use a fixed assembly name to generate the app specific WinRT interop?"

Mmh could you elaborate on how would this work? Suppose I have some project B (either an app, or perhaps a WinRT component), where I reference library A.dll, which has that [UnsafeAccessor] for some well known interop .dll it expects. I now build/deploy/publish project B. Who produces that interop .dll, and how? 🤔

jkoritzinsky commented 2 months ago

Project B would have targets that run after CoreCompile that would generate and compile a sidecar assembly (likely with a temp csproj like how WPF XAML works today) that is only referenced by UnsafeAccessor methods. It would be copied out to the build and publish directories, added to the deps.json, and passed to crossgen2 and ILC as a reference.

Sergio0694 commented 2 months ago

Mmh I see, that's interesting. Just so I understand, any particular reason why this would have to be after CoreCompile and then manually copied and added to deps.json, etc.? Could it not be compiled before CoreCompile and then the assembly added as <Reference> to the actual application project being built, so that everything else would "just work"?

jkoritzinsky commented 2 months ago

You'd need to do it after CoreCompile if you want to analyze the "app" project easily and have it's generic helpers in this shared sidecar assembly as well.

Sergio0694 commented 2 months ago

I see. Ok this idea is growing on me, and I could see this giving us a way to introduce all kinds of global program view optimizations for CsWinRT in the future. Is there some documented way to do this that we could take a dependency on in CsWinRT? That is, this part slightly worries me:

"It would be copied out to the build and publish directories, added to the deps.json, and passed to crossgen2 and ILC as a reference."

Are there public docs on what properties we'd need to set, where exactly we'd need to copy that .dll to, etc.? Related: would it make sense to add some built-in SDK/MSBuild support for this type of thing to do it automatically?

jkoritzinsky commented 2 months ago

This would basically be done with MSBuild targets like how WPF does it.

You'd want to hook after CoreCompile or before GenerateBuildDependencyFile and generate a csproj file in a subdirectory of the obj folder. Then build it and add the resulting assembly as a ReferencePath (you might need Private=true as well, not sure). Then that should flow through into the deps file, get copied into the right places, and passed to R2R and ILC.

There shouldn't be any .NET SDK work necessary here.

AaronRobinsonMSFT commented 1 month ago

@Sergio0694 Based on the conversation about it doesn't look like this addition to the API is going to be considered. Feel free to re-open if needed, but I'm closing for now.

Sergio0694 commented 1 month ago

Yup, that works for me! We'll be working on an implementation using an extension .dll as suggested 🙂 Thank you everyone for the help!