dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.26k stars 4.73k forks source link

[mono][wasm] TrimMode=full is trimming UnmanagedCallersOnly #101434

Open mkhamoyan opened 6 months ago

mkhamoyan commented 6 months ago

Description

After https://github.com/dotnet/runtime/pull/100288 still for browser UnmanagedCallersOnly is not called after publish.

Reproduction Steps

create browser sample app

    public unsafe partial class Test
    {
        public unsafe static int Main(string[] args)
        {
            UnmanagedFunc();
            return 0;
        }

      [UnmanagedCallersOnly(EntryPoint = "ManagedFunc")]
      public static int ManagedFunc(int number)
      {
          // called from UnmanagedFunc
          Console.WriteLine($"ManagedFunc({number}) -> 42");
          return 42;
      }

      [DllImport("local", EntryPoint = "UnmanagedFunc")]
      public static extern void UnmanagedFunc(); // calls ManagedFunc
   }

and in local.c

#include <stdio.h>

int ManagedFunc(int number);

void UnmanagedFunc()
{
    int ret = 0;
    printf("UnmanagedFunc calling ManagedFunc\n");
    ret = ManagedFunc(123);
    printf("ManagedFunc returned %d\n", ret);
}

Expected behavior

ManagedFunc should be called

Actual behavior

No build errors, in pinvoke-table.h wrapper is there, but ManagedFunc is not being called.

Regression?

No response

Known Workarounds

Calling it explicitly

((IntPtr)(delegate* unmanaged<int,int>)&ManagedFunc).ToString();

Configuration

No response

Other information

No response

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to this area: @dotnet/interop-contrib See info in area-owners.md if you want to be subscribed.

lewing commented 6 months ago

NativeAOT appears to use UnmanagedCallersOnlyAssembly which feeds into https://github.com/dotnet/runtime/blob/dd24e9c3ebaf6cda91fc281657878b75428f178c/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/UnmanagedEntryPointsRootProvider.cs#L15 @sbomer is there something similar for the trimmer?

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to this area: @dotnet/interop-contrib See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 6 months ago

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.

sbomer commented 6 months ago

ILLink doesn't have such support. Do we need to add support for UnmanagedCallersOnlyAssembly for this scenario to work? Or could the tooling that generates the interop logic for the UnmanagedCallersOnly method be made to pass along some extra roots to ILLink?

@MichalStrehovsky @AaronRobinsonMSFT

MichalStrehovsky commented 6 months ago

NativeAOT appears to use UnmanagedCallersOnlyAssembly which feeds into

NativeAOT would only export these if we're building a library. Is this building an executable that also acts as a library? Feels like a pretty special thing and I agree with Sven that maybe whatever is generating pinvoke-table.h could also generate a roots descriptor that could be passed to illink to keep these.

MichalStrehovsky commented 6 months ago

Alternatively if we expect to support trimmed native libraries at some point, a general purpose mechanism could be added to illink in preparation for that - I could see a new command argument allowing to specify a list of assemblies where UnmanagedCallersOnly methods with a non-empty EntryPoint should be considered as roots. This command line argument could be used for libraries, but also for these executables that also export things.

dotnet-policy-service[bot] commented 2 months ago

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.

pavelsavara commented 2 months ago

It seems this is gap in ILink, please re-assign

MichalStrehovsky commented 2 months ago

It seems this is gap in ILink, please re-assign

Is https://github.com/dotnet/runtime/issues/101434#issuecomment-2075854243 not feasible?

pavelsavara commented 2 months ago

My understanding is that _GenerateManagedToNative happens after _RunILLink in Publish nested build. Therefore the method is already gone when we get to it there.

Binlog wasi-uco-publish.zip

But overall it feels like the UnmanagedCallersOnly is expressing the intent pretty well and that ILLink is the right tool which should honor that intent.

Those methods are created

We could mandate that it's not enough, for those tools to do UnmanagedCallersOnly + sometimes also WasmImportLinkageAttribute, and that they have to author the link descriptors too. Is that the right DX ?

For a library in a nuget, we still don't have solution for running _GenerateManagedToNative on that either.

What are the challenges to implement it in ILLInk ? What makes it "pretty special thing" ?

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

MichalStrehovsky commented 2 months ago

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

If I understand it correctly, WASM is hitting this when building an executable. UnmanagedCallersOnly methods are not considered exported roots on native AOT when building an executable either. They are considered roots only when building <OutputType>Library</OutputType>.

ILLink currently doesn't support trimming <OutputType>Library</OutputType>, AFAIK we don't have scenarios where it would be useful right now (these would not be downstream consumable) and nobody defined what trimming a self-contained library means. Without that, there's no need to even look at UnmanagedCallersOnly within ILLink (it would not be desirable to root UnmanagedCallersOnly with non-empty EntryPoint in the general case).

If I understand it correctly, this would be a one-off ILLink feature for WASM only. That's why the question of whether we can do it elsewhere so that WASM specific things stay together with WASM.

pavelsavara commented 2 months ago

Why this is not a problem for NAOT on windows when they want to publish native DLL with exported native methods ? Are they not trimming it ?

After discussing it offline, I understood that NAOT doesn't need to trim native DLL, because DLLs are never self-contained. Because they share the runtime installed elsewhere on the OS, the trimming doesn't make much sense.

ILLink currently doesn't support trimming <OutputType>Library</OutputType>

<OutputType>Library</OutputType> for Browser it makes sense to trim it when it's part of the whole application.

For WASI <OutputType>Library</OutputType> would mean something else than for the rest of dotnet. It would mean that we produce WASI component, which has self-contained dotnet runtime inside. If you have two such WASI components, they would not share the runtime but we would have two runtimes. For such component it makes sense to trim it as if it was the whole APP. The difference is that there is not just one entrypoint, but a WASI component could have multiple exports.

The next question is should we protect all UCOs ? I think the answer is yes, because they represent explicit public native/WASI API of the component.

And if we need to express more nuance to trim some UCOs, I think that ILLInk's feature="EnableOptionalFeature" featurevalue="false" is way how we can trim the functionality, including specific UCOs. It would be opt-in trimming, rather than opt-out.

If I understand it correctly, this would be a one-off ILLink feature for WASM only. That's why the question of whether we can do it elsewhere so that WASM specific things stay together with WASM.

I see that now, thanks. It still feels to me that ILLink is the right place to implement UCO rooting. So that we don't have to implement another msbuild pass, which would do cecil/reflection over all the assemblies one more time. We already have too many passes in WASM publish and it's already slow.

@sbomer do you have objections ? And guidance ?

sbomer commented 2 months ago

My main objection is that keeping all UCO in the app and its dependencies would lead to unnecessary bloat.

Are you aware of any scenarios where the managed code genuinely has no insight into whether the native code will call back into managed? In the sample app above, I think a better solution would be for UnmanagedFunc to have a DynamicDependencyAttribute pointing to ManagedFunc. Using that approach in a library would allow the both methods to be removed if the library code path is unused in a trimmed app.

The next question is should we protect all UCOs ? I think the answer is yes, because they represent explicit public native/WASI API of the component.

How will this work if I'm trying to implement a WASI component, and I have some nuget dependencies with UCO, but I don't want those to be exports of my WASI Component?

pavelsavara commented 2 months ago

Are you aware of any scenarios where the managed code genuinely has no insight into whether the native code will call back into managed?

I guess that's true with any WASI component which has more than one export. At build time, you don't know which of them would be used by other WASI component (for example written in python) that is consuming your component at runtime.

Also in case that there is nuget/assembly with UCOs + WasmImportLinkageAttribute. It means the nuget library wants to contribute into list of exports of the final WASI component. If such nuget would contain list of ILLink descriptors for all exports, we didn't learn anything new from such list.

More specific example I could imagine is WASI HTTP (server) handler, wrapper translating WASI API into C# API. Such nuget would probably contain WASI (UCO) export on behalf of the C# application logic assembly.

How will this work if I'm trying to implement a WASI component, and I have some nuget dependencies with UCO, but I don't want those to be exports of my WASI Component?

I suggest that UCOs should be protected by default and that trimming them would be opt-in via ILLink substitution/stub.

sbomer commented 2 months ago

Thank you, that helps me understand the scenarios a little better. A few more questions/comments:

I agree that when building a library, UCO should be preserved, so that they exist when building an application that depends on the library (but whether they should be trimmed as part of the app build is a separate question). If we added a library trimming mode, I think it would make sense to preserve UCO in that mode - so for the rest of my comments I'll assume we're talking about trimming an app or final WASI component.

Also in case that there is nuget/assembly with UCOs + WasmImportLinkageAttribute. It means the nuget library wants to contribute into list of exports of the final WASI component.

This isn't entirely clear to me. Could it not be the case that a library has UCO for interop with library-specific native code, but doesn't want to contribute it to the exports of the final WASI component?

I suggest that UCOs should be protected by default and that trimming them would be opt-in via ILLink substitution/stub.

We don't currently have a way to do this. The feature substitutions are able to:

But there's not a way to conditionally remove a method (other than making sure it's not called). We could probably come up with a way, but the model that currently makes most sense to me is to allow UCO methods to be removed by default. This allows libraries to preserve UCO methods only when certain code paths are reachable (for example via DynamicDependencyAttribute), allowing them to be removed if the library is unused or partially unused.

pavelsavara commented 2 months ago

Could it not be the case that a library has UCO for interop with library-specific native code

Yes, you are right that there could be UCOs aimed at native code and we don't know if that native code uses it or not. It is actually part of this issue. Right now you have to have link descriptor for those you want to use from native code.

I don't know how bad it would be if we rooted all of them. Are there (native) interop scenarios which generate UCOs opportunistically assuming it would be trimmed later ? (Leading to unexpected bloat as opposed to necessary functionality)

The feature substitutions are able to:

  • conditionally substitute method bodies to a constant

That should be enough to trim extra functionality which UCO rooted. I see now that it would not remove the UCO/export.

If such nuget would contain list of ILLink descriptors for all exports, we didn't learn anything new from such list.

@lewing suggested that wit-bindgen is in better position to categorize the methods into groups by interface they implement.

So that the app developer would have easier time to express what is to be preserved or trimmed by that group/interface. They would still have to do something (root or trim the group), it can't be automagic end to end.

Let's continue thinking about it.

SingleAccretion commented 2 months ago

After discussing it offline, I understood that NAOT doesn't need to trim native DLL, because DLLs are never self-contained. Because they share the runtime installed elsewhere on the OS, the trimming doesn't make much sense.

NAOT does "trim" code from DLLs, because from the NAOT perspective, the output is always a self-contained native artifact. The difference between executables and libraries then becomes only about what are the entrypoints.

The current NAOT model is that you need to specify the assemblies for which EntryPointed UCOs are to be considered roots. The assembly from the project you're publishing is considered to be in this set implicitly, and additional ones can be specified via UnmanagedEntryPointsAssembly (as yet undocumented). This works fine for NAOT native libraries because they're not widely used (yet), and the exports naturally tend to live in one central place - that top-level project.

It works less well for WASI components, where it's expected that you can have exported UCOs in referenced assemblies. It does not work for JSExports.

The NativeAOT-LLVM experience with this problem suggests that some more automatic solution is needed; this is easily the top recurring problem people open issues about. See e. g. https://github.com/dotnet/runtimelab/issues/2626.

SingleAccretion commented 2 months ago

This is to above point about the need for some UnmanagedEntryPointsAssembly-like mechanism:

We've talked a bit with Pavel on Discord, and it was pointed out that it would be undesirable to specify the capability for rooting UCOs on an "unconditional" basis in a way that would force the trimmer (or ILC) to look at all of the referenced assemblies for unmanaged exports, many of which don't need to be looked at today.

sbomer commented 2 months ago

Are there (native) interop scenarios which generate UCOs opportunistically assuming it would be trimmed later?

My understanding of the xamarin-macios managed static registrar, after reading through some of the discussion and code around https://github.com/dotnet/runtime/issues/80912, is that it:

lewing commented 2 months ago

@sbomer my understanding was that we want to move away from custom linker passes as a solution, are you suggesting that is the correct approach?

pavelsavara commented 2 months ago

Notes from discussion with @SingleAccretion earlier this week