dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.96k stars 4.03k forks source link

Opening document causes source generators to re-run #71251

Open KirillOsenkov opened 9 months ago

KirillOsenkov commented 9 months ago

When a gesture like Go To Definition opens a new document in the workspace, this creates a new Solution snapshot containing the new text container backed by the editor buffer of the open document. In that new solution, compilations are not realized. As a result, the moment the compilation is requested, source generators run again.

We should probably have an optimization where if the only thing that changed between two solutions is the document was opened and the text of the open document the same as it was from the previous text loader, we should probably reuse all compilations instead of recomputing them from scratch.

Our situation was exacerbated by the following. We have noticed this issue after investigating high CPU usage with a source generator that writes a file to disk every time (even if the text doesn't change).

The project system notices the file has changed on disk and changes the current solution via the following stack:

    Microsoft.CodeAnalysis.Workspaces   Workspace.SetCurrentSolutionAsync Line 244
    Microsoft.CodeAnalysis.Workspaces   ProjectSystemProjectFactory.ApplyBatchChangeToWorkspaceMaybeAsync Line 257
    Microsoft.CodeAnalysis.Workspaces   ProjectSystemProjectFactory.ApplyBatchChangeToWorkspaceAsync Line 238
    Microsoft.CodeAnalysis.Workspaces   ProjectSystemProject.BatchingDocumentCollection.ProcessRegularFileChangesAsync Line 419
    Microsoft.CodeAnalysis.Workspaces   ProjectSystemProject.ProcessFileChangesAsync Line 1021

In that new solution, compilations have not been realized either. This causes the same generator to run again, writing the same files to disk again. It enters an infinite loop.

Perhaps we should have another optimization here, where we remember the old text of the file that changed on disk, and if the new text of the file on disk is the same, we don't do anything. This would have allowed us to break out of the infinite loop.

This is the logic that updates the solution after the files have been written to disk: https://github.com/dotnet/roslyn/blob/8c38000b3bb2fb64699633eb58e0d284cb3a0ed1/src/Workspaces/Core/Portable/Workspace/ProjectSystem/ProjectSystemProject.BatchingDocumentCollection.cs#L425

jasonmalinowski commented 9 months ago

Just echoing a private conversation here: we do need to create a new compilation (since the syntax tree needs to point to the new text), and we have some optimizations here to attempt to make a smarter text:

https://github.com/dotnet/roslyn/blob/d88c4d64fd7a026f14e1630ae8e7247ee77c21b4/src/Workspaces/Core/Portable/Workspace/Workspace_Editor.cs#L402-L417

Ideally that'd create a new tree but it'd be incremental and perhaps just reuse the underlying green nodes.

KirillOsenkov commented 9 months ago

The infinite loop requires that the generator writes to a file on disk that is a part of the solution, perhaps via AdditionalFiles.

KirillOsenkov commented 9 months ago

Standalone repro here: https://github.com/KirillOsenkov/PublicBugs/tree/master/AnalyzerHavoc

KirillOsenkov commented 9 months ago

It also just occurred to me that the workspace layer is doing file system watching here. Normally I'd thought the workspace abstracts away the file system and the host should be responsible for this kind of stuff? I'd expect since the project system is responsible for most file system watching the host should also take care of picking up changes to files from solution. Another instance of this I think is watching .dll files for changes. Not sure if this is fine in terms of layering and responsibilities.

jasonmalinowski commented 9 months ago

@KirillOsenkov: the base workspace abstracts away the file system. Atop that we have ProjectSystemProject that does do file watching, so we can reuse that logic in VS for Windows and VS Code. (This was the refactoring I think you often wanted for VS for Mac too.) Although that still doesn't call any actual file system watching APIs, but instead goes through an interface. So for VS for Windows, we use the VS file watching service; for VS Code we actually use LSP dynamic registration to send watch requests back to the LSP client since VS Code has cross-platform support we can use. (And there's a fallback to using .NET file watchers if that's unavailable.)