KirillOsenkov / MSBuildStructuredLog

A logger for MSBuild that records a structured representation of executed targets, tasks, property and item values.
MIT License
1.41k stars 188 forks source link

[Proposal] Add dumprecords command #745

Closed JanKrivanek closed 5 months ago

JanKrivanek commented 5 months ago

Fixes #741

Context

741 brought a nice idea. I needed something similar for my investigation - just counts, not sizes - so I stitched something for both.

Samples

Just summary on OrchardCore (runs for about 0.7s on my machine):

>binlogtool dumprecords OrchardCore.binlog --include-total --include-rollup --exclude-details

Total records: 1069320

RecordType, Count, Size[bytes]
==============================
Message, 568034, 38.46
NameValueList, 38359, 60.16
BuildStarted, 1, 12.00
ProjectEvaluationStarted, 417, 35.90
ProjectImported, 64414, 55.11
AssemblyLoad, 22, 55.59
PropertyReassignment, 23492, 41.90
ProjectEvaluationFinished, 417, 1912.71
ProjectStarted, 14300, 56.33
TargetStarted, 49962, 38.80
TaskStarted, 23227, 36.90
TaskFinished, 23227, 34.15
TargetFinished, 49962, 52.04
TaskParameter, 145155, 98.90
EnvironmentVariableRead, 792, 37.90
TargetSkipped, 53035, 58.53
ProjectFinished, 14300, 31.30
TaskCommandLine, 198, 40.51
Error, 5, 46.00
BuildFinished, 1, 14.00

Details for binlog from new console:

>binlogtool dumprecords

RecordType, Size[bytes]
=======================
Message, 50
Message, 50
NameValueList, 9
BuildStarted, 12
Message, 46
Message, 46
Message, 46
Message, 46
Message, 46
ProjectEvaluationStarted, 34
AssemblyLoad, 54
AssemblyLoad, 54
AssemblyLoad, 54
AssemblyLoad, 54
Message, 41
ProjectImported, 45
ProjectImported, 49
Message, 39
...

Limitations

Breaking changes

Perf

Summary for OrchardCore runs ~0.178 seconds with the new binlogs (offsets included), ~1.967 seconds with old binlogs. Both after warmup runs. On full details dumping the difference is not so visible, as majority of the time is spend flushing the records (it's ~56 vs ~58 seconds)

KirillOsenkov commented 5 months ago

Does it support the full list of all records (not the summary)? Useful for diffing. Even just the types of records would be valuable so that you can diff two .txt files and see what is the first record where they start to differ.

Would be nice to support formats before 18 too. We can compute records lengths like ReadRecords does, see my draft in #741

JanKrivanek commented 5 months ago

@KirillOsenkov - All comments sounds sound and easily accomodable - I'll work on this in a few days.

As for:

Does it support the full list of all records (not the summary)? Useful for diffing. Even just the types of records would be valuable > so that you can diff two .txt files and see what is the first record where they start to differ.

The ungrouped list is the default mode. I added the optional groupping as that's what I needed for my other investigation. Though - do you expect the sizes to differ a bit? Would you want to have an option to dump just the types, without sizes?

KirillOsenkov commented 5 months ago

When I have to investigate a corrupt binlog, I usually have the good and the bad binlogs. So I dump a flat list of records to see which record is the first broken one. Even the record number and kind would be valuable, but if we can have the size of each record it'd be better (for completeness). If the size of the corrupted record changes, would be nice to know it.

We could deduce the record size from the start positions of records in the stream (this is how we show the progress bar when reading the binlog). But of course in the new format we get the record sizes for free.

KirillOsenkov commented 5 months ago

Published https://www.nuget.org/packages/binlogtool/1.0.13