Closed CyrusNajmabadi closed 2 years ago
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging @stephentoub @ViktorHofer for visibility. I'm happy to contribute the changes on the runtime side for this if that can help out.
We need to make sure that we not only keep compatibility for downlevel generators (so introduce another build variant of the out-of-band packaged generators that uses the new roslyn API version) and ensure that we don't break uplevel repos like aspnetcore by adopting a Roslyn version that is newer than what many of them use (as we turn on in-box generators by default, which causes build failures when the Roslyn version the upstream repo uses is too old).
Generally, we try to keep the version of Roslyn that the "new" generators build against to at newest the version that ships with the SDK that dotnet/arcade is using to ensure we won't cause compile errors upstack. https://github.com/dotnet/runtime/blob/042fdf43dedd58dc3e6cdc118df5a7198792a9e7/eng/Versions.props#L34-L39
For generators that don't ship outside of dotnet/runtime, we can update them as soon as we have a NuGet package (so that includes the EventSource generator and the XUnitWrapperGenerator)
Note: teh major offenders i'm seeing in traces are Json and Logging. So it woudl be great to be able to update these asap and get these into a VS release that has the new updated API.
Since logging ships out-of-band, we could add another build dimension to it now against the newer Roslyn API version since we can put it into a different path in the NuGet package based on Roslyn version.
JSON ships in-box too, so we can't update the in-box version without breaking upstream repos, but we could update the package version for downlevel consumers immediately if we wanted to.
Basically we can update every generator project that isn't in the following list in some manner now (either by introducing new projects for the out-of-band generators or updating the in-repo ones directly), and then we can update the ones in the list after we know we won't break upstream repos.
cc @joperezr who I believe is looking at source generator perf at the moment.
Tagging subscribers to this area: @dotnet/area-meta See info in area-owners.md if you want to be subscribed.
Author: | CyrusNajmabadi |
---|---|
Assignees: | - |
Labels: | `area-Meta`, `tenet-performance`, `untriaged`, `source-generator` |
Milestone: | - |
@stephentoub @jkoritzinsky
Basically we can update every generator project that isn't in the following list in some manner now (either by introducing new projects for the out-of-band generators or updating the in-repo ones directly), and then we can update the ones in the list after we know we won't break upstream repos.
There's a potential hybrid option we can take.
So what does most**
mean? Well, currently the API depends on some internal functionality in order to support a very unlikely, but useful full correctness
case. Specifically, the API will understand if you do the following:
File1:
global alias FooBarAttribute = System.Text.Json.Serialization.JsonSerializableAttribute;
File2:
[FooBar]
class Whatever { }
In other words, it knows that that global alias exists, and as such the json-generator should run on Whatever
. If you were willing to say that that was a currently unsupported scenario, we could port over the entirety of the new API except teh part that supports global aliases. Practically speaking, i think all users would still be fine. It would be up to you if this functionality break would be worth the massive speedup we're seeing (over 40x in roslyn itself).
To me, given just how bad the perf problem is here, my preference is that we take the perf gain for all the generators, and we accept a highly unlikely functionality break. Then, when you can finally move all generators over to the new api we'll have teh best of both worlds. High performance, and complete correctness.
Any thoughts from your perspective?
Here are my thoughts on this:
Based on that I would propose the following:
cc @eerhardt @jaredpar
It might make sense to change the roslyn4.0 folder in our packages in which the source generators reside to the actual roslyn version that we will depend on when shipping .NET 7. If I'm not mistaken, that would be roslyn4.4 with this change. Presumably the SDK falls back to use the roslyn3.11 targeting source generators if a Visual Studio version is used that doesn't support roslyn4.4.
The issue with this proposal is that projects building with the 6.0.x00
SDK will now see degraded performance when they use the 7.0.0
System.Text.Json nupkg. With the 6.0.x
System.Text.Json nupkg, they will use the roslyn4.0
version, which has incremental build support. When the project upgrades to the 7.0.0
nupkg, there is no roslyn4.0
built version, so they will fall back to roslyn3.11
and now all the incremental support we took advantage of in .NET 6 is removed.
It is pretty easy to get into this situation AFAICT. You just need to be on a VS 2022 version that doesn't include .NET 7 - so let's say VS 17.0
. You are using the 6.0.0
System.Text.Json nupkg, and NuGet Manager says a new version is out in the "Updates" tab, so you update to STJ 7.0.0
.
Agreed, that's a problem and needs to be solved, thanks for pointing that out. What makes this problematic is that we currently don't have a lifecycle policy for source generators when they ship inside packages. I just recently had this discussion with @eerhardt.
Visual Studio 2019, which only works with the roslyn3.11 source generators is supported until April 2029. Based on that, we would need to build and package our roslyn3.11 source generators for the next 6-7 years.
Visual Studio 2022 17.0 is supported until July 2023, 17.2 until January 2024 and the support date for 17.3 isn't yet announced. Because of these support dates, we would need to keep our roslyn4.0 targeted source generators around until at least .NET 9.
It's very likely that we will again depend on newer roslyn features going forward and with that build and package even more versions of our source generators. Packages will grow unbounded and the main branch hosts all different versions of the roslyn source generators.
I don't think there's a good solution to mitigate the increasing build complexity, package growth and ever expanding matrix of source generators. One option that we might want to discuss is to couple our package support policy with toolset versions. Just theoretically speaking, we could mark the System.Text.Json 7.0 package to not support Visual Studio 2019 or older SDKs that don't have a specific roslyn version available. I don't suggest that we should to that but wanted mention it.
We should upgrade non-shipping roslyn4.x source generators to leverage the new API when publicly released tooling supports them. My assumption is that this will be VS 17.4 but I could be mistaken.
This would be a pretty bitter pill @ViktorHofer . That's likely 6 months out. Can you clarify why we could not get this in for the 17.3 line?
@ViktorHofer @eerhardt any thoughts on: https://github.com/dotnet/runtime/issues/70754#issuecomment-1155946743
At least with the idea there, we could get performance benefits to users immediately, albeit with a potential (though hopefully unlikely) functionality regression. This would be a stop-gap for the immediate future until you could actually adopt this new API (which has all hte performance benefit, without any functionality regression).
I'm coming from this from a perspective of trying to address the extremely bad performance regressions we're seeing in VS, which we are already getting complaints on from users. Having to wait until 17.4 is something i'd really like to avoid.
This would be a pretty bitter pill @ViktorHofer . That's likely 6 months out. Can you clarify why we could not get this in for the 17.3 line?
Your change was merged into main and https://github.com/dotnet/roslyn/commits/release/dev17.4 but not https://github.com/dotnet/roslyn/commits/release/dev17.3. Based on that, I don't think that your change will be part of the roslyn that ships with 17.3. @jaredpar is my assumption correct? Would it be possible and make sense to backport @CyrusNajmabadi's change into dev17.3?
pulling in @vatsalyaagrawal . I'm def a little confused. I was under the impression that our main branch was 17.3, not 17.4. @vatsalyaagrawal can you clarify?
Yes, Roslyn main branch is currently pointing to 17.3.P3.
That's great to hear. That would mean that we can update our internal non-shipping source generators to make use of the new API when VS 17.3 P3 publicly releases.
@ViktorHofer Ok great. I think that's a much more reasonable time frame :)
That would mean that we can update our internal non-shipping source generators to make use of the new API when VS 17.3 P3 publicly releases.
Ok great. I think that's a much more reasonable time frame :)
To make sure there's no disconnect, @CyrusNajmabadi, I believe Viktor isn't talking about the two you actually care about today (json and logging). "internal non-shipping source generators" sounds to me like our generators that only support the building of runtime. @ViktorHofer, can you clarify?
Correct. Let me update my above proposal:
@ViktorHofer Can you clarify: in what VS release woudl customers get the important performance fixes for all .net generators? Right now, the perf pain is extremely significant, so we really need a plan/solution that can address this in the near term.
Thanks!
I didn't see an answer to @CyrusNajmabadi proposal. As I understand it, he's proposing adding code for a polyfill into runtime. Generators built against the latest sdk get all the benefits; generators built against older sdks get most of the benefits. If that's an accurate summary, that sounds good to me both from a perspective of helping all customers soon and from a perspective of keeping the diffs between our builds minimal. Is this feasible?
@stephentoub That is correct with the caveat that a currently supported case may temporarily not be supported. Specifically, the case where a user defines a global alias to the attribute the generator needs, and then references that aliases in the attributes they have on their types.
@ViktorHofer Can you clarify: in what VS release woudl customers get the important performance fixes for all .net generators? Right now, the perf pain is extremely significant, so we really need a plan/solution that can address this in the near term.
If we follow my above proposal, that would be .NET 7 Preview 7.
I didn't see an answer to @CyrusNajmabadi proposal. As I understand it, he's proposing adding code for a polyfill into runtime. Generators built against the latest sdk get all the benefits; generators built against older sdks get most of the benefits. If that's an accurate summary, that sounds good to me both from a perspective of helping all customers soon and from a perspective of keeping the diffs between our builds minimal. Is this feasible?
If I understand @CyrusNajmabadi's proposal correctly, this would just be temporary until all source generators can target the new API and we would then remove the polyfill. Is that right? If so, would we able to do so in time for Preview 6 which I believe closes next week?
If we follow my above proposal, that would be .NET 7 Preview 7.
For all customers? Above you wrote "These would co-exist with the already shipping roslyn3.11 and roslyn4.0 source generators in the package"...who would still be using those?
If we follow my above proposal, that would be .NET 7 Preview 7.
Sorry, i'm asking which VS release, not which .net release. My strong preference is that we have some perf fix in place for 17.3.
who would still be using those?
Anyone using older versions of Visual Studio but the very latest packages:
Sorry, i'm asking which VS release, not which .net release. My strong preference is that we have some perf fix in place for 17.3.
.NET and Visual Studio don't release in sync and afaik we haven't yet publicly announced when VS 17.3 releases. I'm following-up on dates offline with @CyrusNajmabadi.
In general, we can update anything out-of-band faster than anything inbox because we just add another variant of the source generators there without requiring customers to update their toolset.
Anyone using older versions of Visual Studio but the very latest packages:
What about .NET 6 apps targeting the .NET 6 SDK?
As an example: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1555650 is one of many customer reported cases of this. In this case basic IDE features are being delayed by 10s of seconds, and the servicehub process is churning through CPU and tons of ram. In this case 30+GB just from the jsongenerator and logging generator.
Note that these are features which we generally want to run in 100s of ms worst case. So we're several orders of magnitude impacted here. This also other OOP semantic operations (like diagnostics and whatnot).
Blocked for adopting the official API on https://github.com/dotnet/roslyn/issues/63318
We do still intend to make this change for 7.0, even though it will likely miss RC1 given that we are currently blocked on that Roslyn bug getting fixed. Once it is, I will go ahead and make this change. Re-adjusting the milestone appropriately.
Roslyn bug was fixed over a week ago. Are we still waiting on the flow?
If we update the generators to use the new Roslyn here, we need to update all consuming repositories as well.
I think the best solution will be to update the SDK in Arcade to use an SDK with the fix so we don’t need to manually upgrade the Roslyn version used in each repo.
I think the best solution will be to update the SDK in Arcade to use an SDK with the fix so we don’t need to manually upgrade the Roslyn version used in each repo.
Who owns doing that -- is that you @jkoritzinsky , @joperezr or someone else?
I'm planning to take care of it this week.
@joperezr Anything left to do with this item? Or can we close it out?
PR isn't merged yet 😄. PR should be ready to go but I'm still waiting on making sure that all upstack repos have ingested an arcade update that will bump their compiler and SDK versions so that they'll be able to ingest the new source generators.
@joperezr Anything left to do with this item? Or can we close it out?
This can be closed now.
Thanks!
Roslyn recently added a new API for greatly speeding up SGs that use the pattern of looking for nodes that have an attribute on it with a particular fully qualified name for the attribute. For example, the json serialization generator looks for this attribute: https://github.com/dotnet/runtime/blob/978df67ced885aeca5f7e75379451bc1a57cc219/src/libraries/System.Text.Json/gen/JsonSourceGenerator.Parser.cs#L46
Currently, this is extremely expensive as on every edit, the generator must hit every attributed class in every file and semantically check if it has an attribute that binds to this fully qualified name. This may mean thousands of files bound, with expensive semantic models and binding checks performed, all for nodes+attributes completely unrelated to this SG.
The added API provides a new entry point to get the same functionality as previously possible, but with much better performance. Now, the generator says what attributes they care about (passing in teh fully qualified metadata name) and roslyn can smartly examine the source code and not even bother doing any binding or semantics when it would be pointless. Intuitively, you can think of it that when roslyn sees
[FooBar]
it goes "well, FooBar (and FooBarAttribute) could not ever bind to JsonSerializableAttribute, so it's pointless to even try". This massively reduces the amount of work done (>40x in roslyn itself) especially when the attribute is never, or almost never, used.Once this api is available in a consumable roslyn package, we should update all the runtime's SGs to use it. That includes the generators for
Regex. Done in https://github.com/dotnet/runtime/pull/70911. This also added the main polyfill code.Json. Done in https://github.com/dotnet/runtime/pull/71653Logging. Done in https://github.com/dotnet/runtime/pull/71651eventsource. Done in https://github.com/dotnet/runtime/pull/71662