KirillOsenkov / MSBuildStructuredLog

A logger for MSBuild that records a structured representation of executed targets, tasks, property and item values.
MIT License
1.41k stars 188 forks source link

Search is broken in 2.1.814 #673

Closed Scottj1s closed 1 year ago

Scottj1s commented 1 year ago

I have a project that repeatedly generates binlogs that the viewer cannot search. Neither Search Log nor Properties and Items turn up any results for common items, such as ProjectName. I have another local branch of the same project (but with a few mods) that does not have this problem. Both repos generate roughly the same size binlog, 3.5MB.

rainersigwald commented 1 year ago

I'm also seeing this, it appears new in 2.1.814 (same binlog in 2.1.790 searches OK). Nothing in the diff (https://github.com/KirillOsenkov/MSBuildStructuredLog/compare/v2.1.790...v2.1.814) is jumping out at me.

Scottj1s commented 1 year ago

Yes, this is a regression with 2.1.814 - confirmed by downloading an older version. Updated title.

To temporarily disable updates after reverting to 790, create the following empty marker file: %LOCALAPPDATA%\MSBuildStructuredLogViewer\app-2.1.790\DisableUpdates.txt

rainersigwald commented 1 year ago

git bisect says 9e590736dd5d563b3bf1c5dfd977525111e6138d is the first bad commit. I do not see how, staring at it.

fyi @antonioaversa

agr commented 1 year ago

For me if I run the log viewer by itself (not double clicking a .binlog file) and open the file from there, search works fine. If I double-click .binlog file it does not.

KirillOsenkov commented 1 year ago

Starting to look at this. The viewer relies heavily on string identity for search. Let's see if ConcurrentDictionary has different semantics than regular dictionary. We may want to roll back and then fix the race conditions seen by @antonioaversa in a different way.

KirillOsenkov commented 1 year ago

I'm looking at the sources of ConcurrentDictionary and my head hurts. The semantics of locking on parts of a data structure seem suspicious - feels like there needs to be a global lock.

I also have a hunch that my previous solution (https://github.com/KirillOsenkov/MSBuildStructuredLog/commit/d1c4eff09277acf4f5afb0aae47180bf077ec442) should have fixed the race conditions just fine, but it wasn't actually deployed.

For now I'm going to roll the ConcurrentDictionary back and we're going to scratch our heads later.

KirillOsenkov commented 1 year ago

Is anyone around to validate that https://github.com/KirillOsenkov/MSBuildStructuredLog/commit/334fc34ea4857b9d1ac11ede087628cfb5417e10 fixes it?

KirillOsenkov commented 1 year ago

Published 2.1.815, let's take it for a spin.

agr commented 1 year ago

2.1.815 works for me.

KirillOsenkov commented 1 year ago

OH, I know what is happening.

It's subtle.

There's a semantic difference between Dictionary<TKey, TValue>.Keys and ConcurrentDictionary<TKey, TValue>.Keys.

The first one returns the same mutable collection for the entire lifetime of the dictionary (and it changes in sync with the dictionary).

ConcurrentDictionary returns new immutable snapshot collections depending on when you ask it (makes sense, to guarantee concurrency via immutability)

Compare https://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,135 vs. https://referencesource.microsoft.com/#mscorlib/system/Collections/Concurrent/ConcurrentDictionary.cs,2052

The search function accessed the Keys collection, but we only snap it at the very beginning once: https://github.com/KirillOsenkov/MSBuildStructuredLog/blob/334fc34ea4857b9d1ac11ede087628cfb5417e10/src/StructuredLogger/Serialization/StringCache.cs#L13

To use the ConcurrentDictionary we'd need to change it to => to return the up-to-date Keys collection on every access.

At least now we understand why search is broken. The next step is to double-check whether the old implementation with locking around a simple dictionary is fine, or whether there is still an underlying race condition. I think it's fine.

But if we wanted to switch to ConcurrentDictionary we'd need to provide the ability to provide the custom string set via SetStrings() as well as ensure we access the latest Keys.

Also we'd need to measure the perf impact of both binlog loading time as well as search time.

Scottj1s commented 1 year ago

@KirillOsenkov thanks for the rapid turn around!