microsoft / perfview

PerfView is a CPU and memory performance-analysis tool
http://channel9.msdn.com/Series/PerfView-Tutorial
MIT License
4.14k stars 707 forks source link

Truly Dynamic Event #2051

Closed cshung closed 2 months ago

cshung commented 3 months ago

This PR is ready to be reviewed. (It is easier to review by commits)

The idea of truly dynamic event is to make GCDynamicEvent truly dynamic, in the sense that the GC team can modify the runtime to emit any dynamic event, and then we can consume them in a reasonable API without modifying TraceEvent at all.

The primary use case for this is through a .NET Interactive notebook, where:

  1. The notebook user specifies a list of DynamicEventSchema that specify the names of the events the schemas handle as well as the field and types, and then
  2. Be able to access them through the TraceGC.DynamicEvent() extension method as if they were parsed, using the C# dynamic keyword magic.

Beside the obvious advantage to reduce work, it also allow a more agile practice. Previously, whenever we release a new event, we have to stay backward compatible with respect to things like field name and types and how values are interpreted, etc. With the design right now, we can selectively only expose event in the traditional way if they were meant to be stable and useful to others, and expose it the dynamic way if it is otherwise. Customers are warned that these event schemas are subjected to change without notice, as a matter of fact we won't even release them. They aren't meant to be private (you can easily skim them from the source), they merely meant without back compat guarantees.

This PR is meant to be used in https://github.com/dotnet/performance/pull/4274

Maoni0 commented 3 months ago

the dynamic events should be in a List, not a Dictionary. all you need to communicate to the notebooks side is for this GC, here are timestamps for the dynamic event it observed and here's the info for these events.

"info" could even mean just a blob of data that's fired in this event and the notebooks side can parse everything including name/version/payload. do you see a problem with this approach? is there a need for TraceEvent to even parse the name/version?

I would also suggest to try out using this in a notebook just to make sure it actually works, if you haven't.

cshung commented 3 months ago

the dynamic events should be in a List, not a Dictionary. all you need to communicate to the notebooks side is for this GC, here are timestamps for the dynamic event it observed and here's the info for these events.

"info" could even mean just a blob of data that's fired in this event and the notebooks side can parse everything including name/version/payload. do you see a problem with this approach? is there a need for TraceEvent to even parse the name/version?

I would also suggest to try out using this in a notebook just to make sure it actually works, if you haven't.

Not at all, the parsing is really all meant for usability so that you can write:

gc.DynamicEvents.SizeAdaptationTuning.new_n_heaps

instead of

BitConverter.ToUInt16(gc.dynamicEventIndex.Single(r => r.Name.Equals("SizeAdaptationTuning")).Payload, 4)

I still think that the parsing is valuable, but TraceEvent might not be the right home for it. I was thinking about moving them to GC.Infrastructure instead, which allows for quick iteration. We can use extension method and the DynamicEvents property don't need to reside on the TraceGC object itself. For that I would talk to @MokoSan first.

@MokoSan and I will try out the notebook to make sure it works there.

The parsing of the names are currently needed because we want to filter out the known events (e.g. CommittedUsage) regardless, but TraceEvent don't need to index them by name, we can totally just do a plain list of RawDynamic objects (a good name is wanted for that)

Maoni0 commented 3 months ago

I still think that the parsing is valuable, but TraceEvent might not be the right home for it.

parsing is of course valuable and the question is exactly where it should be. I was suggesting we didn't need to do any parsing in TraceEvent but you pointed out we needed to parse the event name on the TraceEvent side to filter out supported dynamic events so it sounds like you are saying only the name parsing is needed on the TraceEvent side and everything else should be done on the GC analysis side (we can decide if that's in the c# library or the notebooks).

brianrob commented 3 months ago

Just looking through this PR, and I want to make sure that I understand the goal correctly. Is the goal here to keep the existing "known" dynamic events, but otherwise allow access to the unknown dynamic events without having TraceEvent understand their metadata (and you just parse them in your own code)?

Maoni0 commented 3 months ago

Just looking through this PR, and I want to make sure that I understand the goal correctly. Is the goal here to keep the existing "known" dynamic events, but otherwise allow access to the unknown dynamic events without having TraceEvent understand their metadata (and you just parse them in your own code)?

@brianrob it's to keep the supported dynamic events (right now we have the commit usage event which we do support, meaning tools can use them and depend on them - if we change them we'll bump up the versions; and we might add more later) and allow access to the unsupported ones that may change at any time which we only consume for the GC internal analysis.

cshung commented 3 months ago

Just for context, this work is inspired because we wanted additional diagnostics information for DATAS as in this PR.

https://github.com/dotnet/runtime/pull/103405

With that change, the runtime will never emit HeapCountTuning and HeapCountSample events anymore.

For that, I would lilke to depreciate HeapCountTuning and HeapCountSample in favor of this new approach. Ideally I will just delete them. While theoretically that will be a breaking change, I really doubt if anyone actually depends on them in practice, considering that:

  1. These events are emitted only when DATAS is turned on in .NET 8 or 9 early previews,
  2. These events emits rather cryptic values that a general person who don't work in the GC probably doesn't know what they are, and
  3. As a general person, they can probably do nothing about them. It is not like they can tune the heuristic implementation or modify any configuration for that.
brianrob commented 3 months ago

Just for context, this work is inspired because we wanted additional diagnostics information for DATAS as in this PR.

dotnet/runtime#103405

With that change, the runtime will never emit HeapCountTuning and HeapCountSample events anymore.

For that, I would lilke to depreciate HeapCountTuning and HeapCountSample in favor of this new approach. Ideally I will just delete them. While theoretically that will be a breaking change, I really doubt if anyone actually depends on them in practice, considering that:

  1. These events are emitted only when DATAS is turned on in .NET 8 or 9 early previews,
  2. These events emits rather cryptic values that a general person who don't work in the GC probably doesn't know what they are, and
  3. As a general person, they can probably do nothing about them. It is not like they can tune the heuristic implementation or modify any configuration for that.

I don't have a problem with you removing them entirely if they are no longer relevant. I agree that I am not super concerned with such a break.

brianrob commented 3 months ago

rted ones that may change at any time which we only consume for the GC i

Gotcha, thanks.

brianrob commented 3 months ago

If you do end up removing the existing supported dynamic events, and don't add more, I am wondering if any change to TraceEvent other than the removal is required? If you do all the parsing outside of TraceEvent, then I suspect you just need access to the raw event payload itself.

cshung commented 3 months ago

If you do all the parsing outside of TraceEvent, then I suspect you just need access to the raw event payload itself.

Yes, the new iteration I pushed today do just that. For the unparsed events, fire them through the GCDynamicTraceEvent and so TraceGC can expose them, that's all.

cshung commented 2 months ago

@brianrob, this change is ready to be reviewed and merged. Can you please take a look?