dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.93k stars 4.64k forks source link

EventSource use of EventAttribute triggers dynamic methods in attribute usage #90405

Open stephentoub opened 1 year ago

stephentoub commented 1 year ago

For an empty Main method, and enabling DOTNET_JitDisasmSummary, we see all of these:

   8: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
   9: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
  10: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
  11: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
  12: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
  13: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
  14: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  15: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=26, code size=37]
  16: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  17: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  18: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41]
  19: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41]

EventSource is looking for [Event] on a type using GetCustomAttributes, which uses MethodInfo.Invoke to populate the attribute, and since there's more than one attribute, the setters all get invoked twice, triggering the MethodInfo.Invoke to use reflection emit and dynamically generate a method to handle each setter.

On my machine, this is costing ~10% of time-to-Main. When I disable the dynamic method approach, startup improves by ~5ms.

ghost commented 1 year ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

Issue Details
For an empty Main method, and enabling DOTNET_JitDisasmSummary, we see all of these: ``` 8: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 9: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 10: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 11: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 12: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 13: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 14: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 15: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=26, code size=37] 16: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 17: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 18: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41] 19: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41] ``` EventSource is looking for `[Event]` on a type using `GetCustomAttributes`, which uses MethodInfo.Invoke to populate the attribute, and since there's more than one attribute, the setters all get invoked twice, triggering the MethodInfo.Invoke to use reflection emit and dynamically generate a method to handle each setter. On my machine, this is costing ~10% of time-to-Main. When I disable the dynamic method approach, startup improves by ~5ms.
Author: stephentoub
Assignees: -
Labels: `area-CodeGen-coreclr`
Milestone: -
ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-reflection See info in area-owners.md if you want to be subscribed.

Issue Details
For an empty Main method, and enabling DOTNET_JitDisasmSummary, we see all of these: ``` 8: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 9: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 10: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 11: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27] 12: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 13: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28] 14: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 15: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=26, code size=37] 16: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 17: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40] 18: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41] 19: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41] ``` EventSource is looking for `[Event]` on a type using `GetCustomAttributes`, which uses MethodInfo.Invoke to populate the attribute, and since there's more than one attribute, the setters all get invoked twice, triggering the MethodInfo.Invoke to use reflection emit and dynamically generate a method to handle each setter. On my machine, this is costing ~10% of time-to-Main. When I disable the dynamic method approach, startup improves by ~5ms.
Author: stephentoub
Assignees: -
Labels: `area-System.Reflection`, `untriaged`
Milestone: -
steveharter commented 1 year ago

One approach to addressing some of this is to implement https://github.com/dotnet/runtime/issues/75357 which would share previously generated IL if the signature is the same and the method is not virtual.

Otherwise, there are other options to consider such as detecting the startup phase in the runtime, and having reflection respect that so it doesn't generate the invoke stubs.

ericstj commented 1 year ago

@stephentoub does the logging here happen because we're enabling DOTNET_JitDisasmSummary or is it happening all the time? In other words - can we try and avoid the logging codepath during startup?

stephentoub commented 1 year ago

does the logging here happen because we're enabling DOTNET_JitDisasmSummary

It's not related to DOTNET_JitDisasmSummary; the JIT handles that directly, not managed code. EventSource is unrelated to that.

or is it happening all the time?

Invariably on Windows there's something somewhere on the box that's enabled an ETW session, which then causes every .NET app to trip over this, with NativeRuntimeEventSource being created on the startup path.

steveharter commented 1 year ago

Moving to 9.0 however I'll investigate a solution now and we can always backport. The regression occurred in v7 (not v8).

Invariably on Windows there's something somewhere on the box that's enabled an ETW session, which then causes every .NET app to trip over this, with NativeRuntimeEventSource being created on the startup path.

Another approach is to determine if the ETW logging during startup is necessary.

steveharter commented 1 year ago

On my machine, this is costing ~10% of time-to-Main

@stephentoub how was this measured? I tried launching corerun against simple console app and didn't see any improvments when only using interpreted invoke. Typical perf result like this: Method Mean Error StdDev Median Min Max Allocated
AFTER 5.290 ms 0.1116 ms 0.1240 ms 5.271 ms 5.100 ms 5.571 ms 565 B
BEFORE 5.306 ms 0.1625 ms 0.1806 ms 5.309 ms 5.031 ms 5.628 ms 565 B
steveharter commented 1 year ago

With Jit results showing fewer items: BEFORE

   1: JIT compiled System.Guid:FormatGuidVector128Utf8(System.Guid,ubyte) [Tier0, IL size=322, code size=584]
   2: JIT compiled System.HexConverter:AsciiToHexVector128(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]) [Tier0, IL size=78, code size=359]
   3: JIT compiled System.Runtime.Intrinsics.Vector128:ShuffleUnsafe(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]) [Tier0, IL size=41, code size=50]
   4: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
   5: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
   6: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
   7: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=27]
   8: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
   9: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Object,ulong) [FullOpts, IL size=25, code size=28]
  10: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Level(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  11: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Message(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=26, code size=37]
  12: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Task(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  13: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Opcode(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=40]
  14: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Version(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41]
  15: JIT compiled (dynamicClass):InvokeStub_EventAttribute.set_Keywords(System.Object,System.Span`1[System.Object]) [FullOpts, IL size=36, code size=41]
  16: JIT compiled StartupApp.Program:Main(System.String[]) [Tier0, IL size=3, code size=15]

AFTER

   1: JIT compiled System.Guid:FormatGuidVector128Utf8(System.Guid,ubyte) [Tier0, IL size=322, code size=584]
   2: JIT compiled System.HexConverter:AsciiToHexVector128(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]) [Tier0, IL size=78, code size=359]
   3: JIT compiled System.Runtime.Intrinsics.Vector128:ShuffleUnsafe(System.Runtime.Intrinsics.Vector128`1[ubyte],System.Runtime.Intrinsics.Vector128`1[ubyte]) [Tier0, IL size=41, code size=50]
   4: JIT compiled StartupApp.Program:Main(System.String[]) [Tier0, IL size=3, code size=15]
steveharter commented 1 year ago

Linking related issue https://github.com/dotnet/runtime/issues/85791

stephentoub commented 1 year ago

Typical perf result like this:

What are you measuring that's only taking 5ms? That's not coreclr startup.

@stephentoub how was this measured?

IIRC, I built once with main and once with the codegen code path commented out, and then used Measure-Command to time many empty console app invocations.

steveharter commented 1 month ago

Possible design that is extensible: