dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.07k stars 4.69k forks source link

Create an internal sorted equatable collection type for incremental source generators #89318

Open layomia opened 1 year ago

layomia commented 1 year ago

Based on feedback from @eiriktsarpalis.

ImmutableEquatableArray<T> implements sequence equality. The JSON generator uses it in its specs to make incremental generation possible. However, spec collections often need to have set-like semantics to retain equality & avoid regeneration when there are compilation diffs that don't affect the effective inputs to the generator. IOW we want to avoid false negatives in the incremental cache.

An upcoming PR to enable incremental generation for the config binding generator (https://github.com/dotnet/runtime/issues/83534) will share this type. As a follow up to that PR, we should create a new type (say ImmutableEquatableSet<T>) that guarantees set semantics by construction, following the implementation of SortedList<T>. This would avoid false negatives in the incremental cache.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-extensions-configuration See info in area-owners.md if you want to be subscribed.

Issue Details
Based on [feedback](https://github.com/layomia/dotnet_runtime/commit/34db20bbec3e9965b8b8e7777caae59684f8ccb4#r122314896) from @eiriktsarpalis. `ImmutableEquatableArray` implements sequence equality. The JSON generator uses it in its [specs](https://github.com/dotnet/runtime/tree/a864ec70925181f00c38e15e4b1896f2584953c4/src/libraries/System.Text.Json/gen/Model) to make incremental generation possible. However, the lists often need to have [set-like semantics](https://github.com/dotnet/runtime/blob/a864ec70925181f00c38e15e4b1896f2584953c4/src/libraries/System.Text.Json/gen/JsonSourceGenerator.Parser.cs#L148) to retain equality & avoid regeneration when there are compilation diffs that don't affect the effective inputs to the generator. Currently, parsed elements (cached in intermediary data structures) are sorted before construction of the incremental lists / containing specs. An upcoming PR to enable incremental generation for the config binding generator (https://github.com/dotnet/runtime/issues/83534) will share this type, and also needs to sort elements before construction. As a follow up to that PR, we should create a new type (say `ImmutableEquatableSortedList`) that guarantees set semantics by construction following the implementation of `SortedList`. This would avoid intermediate lists, pre-sorting, and avoiding false negatives in the incremental cache.
Author: layomia
Assignees: -
Labels: `area-Extensions-Configuration`, `source-generator`
Milestone: 8.0.0
eiriktsarpalis commented 1 year ago

For simplicity we might just call it ImmutableEquatableSet<T> and I suspect we might eventually need an ImmutableEquatableDictionary too.

Tornhoof commented 1 year ago

Can we get an oob nuget package with the sources of such collections as content files?

At the moment most src gens simply copy the Array type around or fail to understand that they are practically necessary for proper caching.

Why Nuget with source files as content? Binary Nugets with src gens are quite complicated to get them working properly.

eiriktsarpalis commented 1 year ago

Can we get an oob nuget package with the sources of such collections as content files?

We should definitely consider something like this eventually, once the proposed new types have been tried and tested internally.

AaronRobinsonMSFT commented 1 year ago

@jkoritzinsky Any possibility to share with the interop source generators?

jkoritzinsky commented 1 year ago

We have a collection today that has this behavior, SequentialEqualImmutableArray<T>. We also had one for dictionary at one point, but dropped it as we were only using it in cases where some of the types would never be equal anyway. @Sergio0694 also has an implementation with a better API surface in the .NET Community Toolkit.

Sergio0694 commented 1 year ago

Yeah I've been usig my own EquatableArray<T> type (here) across all of my repositories (.NET Community Toolkit, ComputeSharp, PolySharp, a whole bunch of generators I wrote for the Microsoft Store, etc.). It's effectively one of the absolute must have building blocks when writing a new source generator. I will say — I love the idea of adding something like this to the BCL (provided it has an equivalent API surface), though along with that of course we'd have to make this available through some OOB package (eg. System.Collections.Immutable?) so that you'd automatically get this as well in every source generator project as a transitive dependency from Roslyn. That would be pretty nice.

Worth noting — that's just one of the many missing APIs you have to manually polyfill every single time you're writing a generator, but this one in particular feels like something should really just be built-in, as it's also perf-critical 😅

ericstj commented 2 months ago

Moving out all the incremental source gen work to 10.0

eiriktsarpalis commented 2 months ago

FWIW I've been using equatable sets and dictionaries with success in typeshape-csharp:

https://github.com/eiriktsarpalis/typeshape-csharp/tree/6a6c68c76829fb26b8aaec42b8a3bdff02018423/src/TypeShape.Roslyn/IncrementalTypes