Open Sergio0694 opened 3 years ago
The issue is that NormalizeWhitespace is quite expensive
I'd be interested to see if we could just fix that. Then everyone benefits. If not, then maybe specialized apis are needed.
I see this has been added to the 17.1 milestone but the API still needs some work. If Roslyn proposals work the same as runtime ones I assume that means we should just discuss that here and settle on an API shape that can then go to LDM review?
@CyrusNajmabadi is the blocker for you just the fact that you'd like to first investigate whether NormalizeWhitespace
can be sped up enough yet to avoid the need for this API? If so, have you had a chance to look into this so far? Additionally, even if that was possible (certainly making the syntax tree traaversal faster wouldn't hurt in general), do you think that this specific API wouldn't provide enough value on its own regardless, given that it would likely still be faster and more memory efficient due to not having to rebuild and allocate a whole deep copy of the syntax tree being rendered? π€
Just trying to get the conversation going here as I would love for this to make it into 17.1, is all π
is the blocker for you just the fact that you'd like to first investigate whether NormalizeWhitespace can be sped up enough yet to avoid the need for this API
Yes. Though I'm not volunteering for this :-)
... but would greatly improve the performance of their source generators ...
I haven't seen much evidence that this is true once you combine it with:
In other words, while NormalizeWhitespace has been measurably significant at times, I've found that a source generator deemed observably slow by users achieved that characteristic through other means. Even in the case of CsWin32 (the source generator with the most pronounced impact from NormalizeWhitespace), it would not have been fixed by this proposal and it has already mitigated the problem through other means.
Closing this issue as we've seen no reply to the request for more information. If you are able to get the requested information, please add it to the issue and we will retriage it.
Uh... This feels like it wasn't resolved yet, should we reopen it? π
@CyrusNajmabadi can you clarify what more info you were asking for?
Basically sam's point from above. Namely that this is actually an issue in practice when IGs are used properly. I feel like we need more information to indicate why this is the right place to be focusing API attention.
I remember another issue where it was discussed that ideally, Roslyn/VS would automatically call NormalizeWhitespace().ToFullString()
on all generated code, so that exceptions thrown in genertated code would have stack traces with reasonable and consistent line numbers.
Is that still on the table? If so, maybe it would make sense to implement that and then get telemetry from VS to see if this current proposal makes sense π
would automatically call
It is absolutely not in the table. C# is definitely whitespace sensitive, and doing this could def break code. We should not automatically do anything. Of you want normalized code, and that's ok for your domain, then just normalize it first.
I'm referencing https://github.com/dotnet/roslyn/issues/49685#issuecomment-738078844
@AArnott Would you mind weighing in on this? π
I have no problem with people making normalize whitespace fast. Seems like it people need better performance there then we should just get some prs out to improve it, versus creating whole new apis.
As I've mentioned in the past, normalize was never something anyone spent time optimizing. So it's likely it can just be sped up if someone was interested in looking into that.
I disagree with @sharwell that CsWin32's whitespace problem or perf problems are solved. We avoid NormalizeWhitespace completely now because it is unacceptably slow. Instead, we have our own FastSyntaxFactory
class and use the Banned API analyzer to forbid use of SyntaxFactor and a bunch of overloads on various syntax classes because they use elastic trivia which kicks in some automatic normalizing (I guess... my memory is fuzzy on this as we did it long ago). In addition to trying to get some reasonable whitespace (e.g. a space after a type name so it doesn't 'collide' with the next node or token) during generation itself, we have our custom whitespace 'visitor' class that fixes indentation and some other whitespace concerns.
The result is ugly, but readable. It also isn't fast. It's much faster than NormalizeWhitespace, but I just fixed a perf bug this morning where we spent 20 seconds adding space between members of a type: we had a loop where we changed a SyntaxList
within the loop instead of converting it to a mutable List first and then recreating the SyntaxList at the end.
The lengths we go to in order to have acceptable perf is far in excess of what I would consider reasonable, and the resulting generated text is sub-par at best.
I don't care if it's a new API or fixing an old one, but we should not have to go to lengths to get good perf.
To Sam's comment about incremental SGs, csWin32 doesn't yet use any incremental APIs. CsWin32 is fundamentally a non-incremental thing for the most part because one character typed in any code file could potentially change generated code. I expect we can find specific syntax sub-trees that can be cached and reused, but it will be very expensive and we haven't prioritized that yet, since our UX is ok for now. My primary concern about perf right now is for our automated testing (which resembles real build time for customers), where just one run of our 'full generation' test is now taking 12 minutes on a really fast machine. We have to run it under 12 different configurations. So needless to say, I'm very interested in improving perf in the non-incremental case.
@AArnott i agree with thsi:
The lengths we go to in order to have acceptable perf is far in excess of what I would consider reasonable
Currently, teh generation part of SG has zero funding. I've proposed options in the past about providing improvements and APIs here, but we haven't been able to fit them into the schedule at all because of all the other pressing matters. Generally speaking, i think this is because SGs are considered a highly advanced scenario, for niche areas. So the feeling is that the people using it are advanced enough to roll solutions, without us having to provide that ourselves. I'll push again for us to get some budget to improve things here. But i'm not certain anything will have changed in this area.
Would love to see a solution to this. 40% of our build time is spent on calls to NormalizeWhitespace (full disclosure - we are generating thousands of classes at build time, using incremental source generators).
Briefly playing with removing calls to NormalizeWhitespace and using trivia manually, then extrapolating the gains, we could see a reduction in build time from 3 minutes (on an MacBook M1 Max) to just over 1.5 minutes. Reduced memory usage was also observed (1GB vs 2GB). Unfortunately, manually generating trivia in EVERY syntax tree would be a huge pain and I'm not prepared to introduce all of that extra code to the generation process.
@jeff-simeon can you jsut emit the strings directly? no point in SGs in using nodes, as we're going to have to just parse things anyways.
I suppose we could - though it would make the code a lot harder to read, and it's relatively complex already.
This was also briefly mentioned on Discord as well β in many scenarios, using syntax nodes makes the code for generators much easier to understand, follow and maintain, especially when the generators get more complicated. This applies to several generators in the .NET SDK as well (cc. @jkoritzinsky).
Imho it'd be nice to have a more optimized version of this, on second thought a WriteToWithNormalizedWhitespace
would be even better, actually (or maybe both, since once uses the other). The entire tree duplication that NormalizeWhitespace
does is just completely unnecessary work.
@Sergio0694 - I agree and this is exactly the scenario we have - which is probably why @AArnott wrote a FastSyntaxFactory which wraps a StringBuilder and avoids the nodes.
That seems reasonable. Can you just use that factory instead?
@CyrusNajmabadi - I'm not sure what you're talking about. I'm referring to the previous post by @AArnott, where he mentions something that his team built. I do not have access to this and searching for it on the internet yields no results. I'm merely speculating as to what he did to work around the problem.
though it would make the code a lot harder to read, and it's relatively complex already.
I don't think ease of readability is a core goal of SGs. Basically, every time this has come up on the team (and i've brought it up a few times), the thinking is that this is all for computers to do, and writing fast string-writing code is acceptable.
If someone wants to make their own api to help here, they can, but it's not an investment area. I proposed options in the past to make things better, but it's all be rejected as there is no capacity on the team to take on new APIs to help out effectively expert scenarios.
I do not have access to this and searching for it on the internet yields no results. I'm merely speculating as to what he did to work around the problem.
It's on github:
You can fork/copy this. Note: this doesn't write to strings, it just avoids elastic trivia, which can be expensive if not needed.
Imho it'd be nice to have a more optimized version of this,
@Sergio0694 Nothing is stopping teams (like .net) that really want this from writing such a thing. Trees are just a simple data structure. And if it's hyper critical that trees be the form you use and you don't want to use trivia and you want to write it out to a string quickly, then writing your own abstraction to do that seems totally viable.
The cost assessment in the past was that this wasn't an area we were willing to invest in. It's hyperspecialized just to this narrow domain, and writing out strings is just something individual libraries can do themselves, or utilize a library that someone wants to invest in.
@Sergio0694 - I agree and this is exactly the scenario we have - which is probably why @AArnott wrote a FastSyntaxFactory which wraps a StringBuilder and avoids the nodes.
My FastSyntaxFactory doesn't do that at all. It merely avoids elastic trivia (IIRC) by explicitly specifying all tokens when forwarding all calls to the regular SyntaxFactory.
though it would make the code a lot harder to read, and it's relatively complex already.
I don't think ease of readability is a core goal of SGs. Basically, every time this has come up on the team (and i've brought it up a few times), the thinking is that this is all for computers to do, and writing fast string-writing code is acceptable.
If someone wants to make their own api to help here, they can, but it's not an investment area. I proposed options in the past to make things better, but it's all be rejected as there is no capacity on the team to take on new APIs to help out effectively expert scenarios.
Ok, thanks for the reply, but it's a bummer :(
We will have to build our own syntax node model and wrap a StringBuilder at some point.
I would avoid a syntax node model altogether. It's an unnecessary intermediary abstraction (and often expensive in memory to boot).
Using a StringBuilder directly would be a maintainability problem. We'll have to build our own model that wraps a StringBuilder....I suppose it will be just like a SyntaxNode model.......but have a fast ToString method on it.
At the end of the day, all abstractions are unnecessary and come at some performance cost. The complexity of what we're doing requires some abstraction from direct string manipulation. That will come at some performance cost. But preferably, not a cost that occupies 40% of our build time.
Using a StringBuilder directly would be a maintainability problem.
I would be very surprised by that. Writing out code is usually very simple. Here's an example of a helper type i whipped up (similar to lots of stuff we've done before) to make it super easy: https://gist.github.com/CyrusNajmabadi/edf9f6059d8fd2ced7a2086a8835be1d
Just write to this, and tell it when you're increasing or decreasing indentation (and def write helpers/extensions to do it for common cases, like blocks). And it takes care of the rest.
We'll have to build our own model that wraps a StringBuilder
Sure. But it basically does nothing but pass through the data, and just handle indentation so that everything is not left-flushed. Almost no work/overhead necessary at runtime, and very simple to use on the consumer side.
At the end of the day, all abstractions are unnecessary and come at some performance cost.
Sure. But syntax-nodes are a huge abstraction cost. Like huge. IIRC, they're around 5 times the size of the strings they represent. And they require substantial manipulation to get them formatted.
They're really good for dealing with user code that needs to be understood/manipulated. They're really not necessary for a space where the end goal is generating code.
Changed the benchmark in this issue to use a custom indented string writer, here's the benchmark:
Method | Mean | Error | StdDev | Ratio | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
SyntaxTrees | 586.810 us | 10.3462 us | 9.6778 us | 1.000 | 30.2734 | 7.8125 | 0.9766 | 183.17 KB | 1.00 |
IndentedWriter | 1.268 us | 0.0151 us | 0.0134 us | 0.002 | 1.1959 | - | - | 5.05 KB | 0.03 |
I think @CyrusNajmabadi has successfully converted me over to the "just write to an indented writer" gang π
@Sergio0694 that matches my expectations. Using syntax trees at all is just hugely unnecessary overhead. They're gargantuan in memory, and involve an enormous amount of pointer chasing.
They're also just not a great intermediary construct for the domain of just wanting to write out a linearized string.
They are great if you want to actually operate on them in a tree-structured fashion. But I don't think I've ever seen a generator that actually does that. Rather, all generators just write the content start to finish. Or they make subsections, but then concatenate then linearly. All these cases are much better served with just writing a string.
Personally, I think writing the string is much easier and clearer too, as long as you have the right writer-abstraction (like the one I linked to).
That is indeed compelling data.
FWIW, CsWin32 does emit node trees, and we modify them along the way. It would be a total rewrite (of a huge codebase) to switch to StringBuilder both because all our SyntaxFactory callers would have to change and because we'd essentially have to decide up front everything we need to do before starting to emit code because we wouldn't be able to emit nodes and modify them later as we do now. It might make a reasonable project to tackle by combining it with supporting incremental source generation, since doing that properly also needs a fairly substantial refactoring so that we can truly track all the inputs into the source generator.
ecause we wouldn't be able to emit nodes and modify them later as we do now.
You can def do this.
Note: as i've mentioned in discord to several people, a very reasonable thing to do is create the indented string chucks for those other nodes up front (and then cache them). Later, when building the full text, it is trivial to add an existing indented node to the final indented writer you are using.
--
I cannot emphasize enough how much of a better approach to SG writing this is. Syntax and trees are a bad abstraction here. They are enormously expensive, unweildy, and designed to be good for very different scenarios.
Personally, i've never once found an SG case that benefits from them. Contract that with string writing. It's blazing fast. Extremely simple. Very easy to compose. And trivial to get the formatting you want within a line, and the indentation we expect for normal C# or VB.
As we can see from the numbers, you should expect roughly two orders of magnitude improvement for CPU and memory, and all using an absolutely trivial type that is super easy to use.
SGs should do this.
Overview
A common pattern in source generators (from what I can tell) for people using Roslyn APIs to generate trees is as follows:
NormalizeWhitespace()
ToFullString()
context.AddSource("Foo.cs", SourceText.From(source, Encoding.UTF8));
with the resultingstring
This is extremely inefficient, specifically at points (2) and (3). The issue is that
NormalizeWhitespace
is quite expensive, and in this case it's also completely unnecessary: we don't want a new tree per se, we just need the finalstring
to have the code from the initial tree with normalized whitespaces. The problem is that thatNormalizeWhitespace
ends up taking the majority of the CPU time in this whole workflow, and also allocating unnecessary memory. The allocations aren't necessarily a problem, but the increased CPU time very much is, considering the performance critical nature of source generators to keep a smooth IDE experience.I've put together a benchmark to show how much
NormalizeWhitespace
can impact performance:Full benchmark source code (click to expand):
```csharp using BenchmarkDotNet.Attributes; using BenchmarkDotNet.Running; using Microsoft.CodeAnalysis; using Microsoft.CodeAnalysis.CSharp; using Microsoft.CodeAnalysis.CSharp.Syntax; using static Microsoft.CodeAnalysis.CSharp.SyntaxFactory; BenchmarkRunner.Run
You can see how calling
NormalizeWhitespace
alone increases the total CPU time by 5x in this sample benchmark.Proposal
The simplest API proposal to address this issue would be to add a
ToFullStringWithNormalizedWhitespaces
API:This would produce a
string
equivalent to callingNormalizeWhitespaces().ToFullString()
, but in an optimized manner, ie. without actually duplicating the whole tree and creating new nodes, but just adding the right whitespaces to the output as needed while traversing the tree. This would require very little code changes by consumers currently doingNormalizeWhitespaces().ToFullString()
, but would greatly improve the performance of their source generators. I could also see some specific APIs to directly create aSourceText
instance from a tree, with normalized whitespaces, which could then also skip creating thestring
entirely, but considering that CPU time is the biggest problem here and not the actual memory allocations, that seems to be less of a priority from what I can see. Ideally, this newToFullStringWithNormalizedWhitespaces()
API would have a performance as close to justToFullString()
as possible.