haskell / ghc-events

Library and tool for parsing .eventlog files from GHC
http://www.haskell.org/haskellwiki/ThreadScope

Handling EventBlocks and roundtripping #99

Open TeofilC opened 1 year ago

TeofilC commented 1 year ago

The eventlog is structured as a list of blocks of events.

A block has a capability number that specifies the capability of upcoming events, and some information about when the block was written.
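A minimal sketch of that structure, using simplified and hypothetical types (`Event`, `Block`, `blockWritten` and `tagWithCap` are illustrative names, not the ghc-events API). It only shows how a block's capability applies to the events it contains:

```haskell
import Data.Word (Word64)

-- Hypothetical, simplified types; not the ghc-events API.
type Timestamp = Word64

data Event = Event
  { evTime :: Timestamp
  , evDesc :: String
  } deriving (Eq, Show)

data Block = Block
  { blockCap     :: Int        -- capability for the events in this block
  , blockWritten :: Timestamp  -- stand-in for "when the block was written"
  , blockEvents  :: [Event]
  } deriving (Eq, Show)

-- Attach each block's capability to the events it contains.
tagWithCap :: [Block] -> [(Int, Event)]
tagWithCap blocks = [ (blockCap b, e) | b <- blocks, e <- blockEvents b ]

-- Made-up example data.
exampleLog :: [Block]
exampleLog =
  [ Block 0 20 [Event 10 "run thread", Event 15 "stop thread"]
  , Block 1 30 [Event 12 "run thread"]
  ]
```

Erasing the blocks while reading corresponds to keeping only the output of something like `tagWithCap` and discarding `blockWritten`.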

Currently we erase block events when reading the eventlog. This leads to two issues:

My proposal is to keep the block events during parsing and to require their presence when writing out eventlogs. This introduces some new illegal states, i.e., an eventlog without block events could no longer be written to a file. How does this sound?
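To make the "illegal state" concrete, here is a rough sketch of the kind of check a writer might perform. The `Item` type and `checkWritable` are hypothetical and heavily simplified; how ghc-events would actually detect or report this is not decided here:

```haskell
-- Hypothetical, simplified item type; not the ghc-events API.
data Item
  = BlockMarker Int   -- starts a block for the given capability
  | Plain String      -- any other event, named for illustration
  deriving (Eq, Show)

-- Under the proposal, a list of events is writable only if every
-- ordinary event comes after some block marker.
checkWritable :: [Item] -> Either String [Item]
checkWritable items = go False (zip [0 :: Int ..] items)
  where
    go _ [] = Right items
    go _ ((_, BlockMarker _) : rest) = go True rest
    go inBlock ((i, Plain name) : rest)
      | inBlock   = go True rest
      | otherwise =
          Left ("event " ++ show name ++ " at index " ++ show i
                ++ " is not inside a block")
```

With this check, `checkWritable [Plain "gc start"]` is rejected with a `Left`, while a list that begins with a `BlockMarker` is passed through unchanged.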

An alternative is to change the types to make these states unrepresentable but I don't think the breaking change from that would be worth it.

Mikolaj commented 8 months ago

That sounds good to me as a past ThreadScope contributor, but we'd probably need feedback from current heavy eventlog users, e.g., @mpickering. Who would be the main consumer of the new feature?

TeofilC commented 8 months ago

I think the main consumers would be tools that want to figure out mutator time. Currently we expose information about GC pauses but not about eventlog flush pauses. This information could also be added to ThreadScope, for instance.

There's also quite an old GHC ticket asking for this: https://gitlab.haskell.org/ghc/ghc/-/issues/11950. Surprisingly, this feature was already implemented in the eventlog when the ticket was opened (!), just not exposed by ghc-events.

The other appeal is that it makes it a bit easier to process eventlogs in a streaming way. For instance, the API currently doesn't expose a way to write an eventlog without sorting all the events (in order to create dummy event blocks). I recently ran into this when trying to filter out a small time range from a very large eventlog. It would also make it a bit easier to process events in order without sorting the eventlog, though that could be done without this change too.
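As a rough illustration of the time-range case, assuming blocks are kept and reading "filter out" as "extract": the filter can work one block at a time over a lazy list, so nothing has to be sorted to rebuild blocks. The types and `filterRange` below are hypothetical, not the ghc-events API:

```haskell
import Data.Word (Word64)

-- Hypothetical, simplified types; not the ghc-events API.
type Timestamp = Word64

data Event = Event
  { evTime :: Timestamp
  , evDesc :: String
  } deriving (Eq, Show)

data Block = Block
  { blockCap    :: Int
  , blockEvents :: [Event]
  } deriving (Eq, Show)

-- Keep only events with lo <= time <= hi, one block at a time.
-- Blocks left empty are dropped; the rest keep their capability,
-- so no global sort is needed to reconstruct them.
filterRange :: Timestamp -> Timestamp -> [Block] -> [Block]
filterRange lo hi blocks =
  [ b { blockEvents = kept }
  | b <- blocks
  , let kept = filter inRange (blockEvents b)
  , not (null kept)
  ]
  where
    inRange e = evTime e >= lo && evTime e <= hi
```

Because the result is produced block by block, a consumer that writes each block out as it arrives only needs the current block in memory, not the whole eventlog.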