Open ghost opened 3 years ago
I have been investigating dotnet format
slowness in our large solution. Here is what I've found out so far:
dotnet format whitespace --folder
is much faster assuming you are using .Net 6 (with the new version of dotnet format
) and are only interested in whitespace
formatting. If you need style
and analyzers
, things get much slower due to compilation.dotnet format
under a specific project instead of the whole solution helps. For example, dotnet format Solution.sln --include SomeProject/SomeFile.cs
is much slower than dotnet format SomeProject/SomeProject.csproj --include SomeProject/SomeFile.cs
for some reason.
Of course this workaround is only useful if you are doing a git diff
to format only affected files in a branch. I wrote a simple bash script to group modified files by project and run dotnet format
concurrently for each affected project with --include <path>
option. It helped reduce the total runtime from more than 2 mins to under 1 min in our solution assuming a couple of files are modified under a few different projects. We do need style
and analyzers
so can't use the fast whitespace
command with the --folder
option.I suspect redundant compilation (due to referenced projects getting built/analyzed multiple times) is the root cause for the slowness but I am not sure yet. Needs more debugging
CSharpSimplifyTypeNamesDiagnosticAnalyzer
is responsible for about 20 seconds of runtime for one of our projects. This is what I got after running analyzers separately with extra logging:
[01/05/2022 02:37:26] Running 7 analyzers on Kaggle.XXX
[01/05/2022 02:37:26] Running analyzer MisplacedUsingDirectivesDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:27] Finished running analyzer MisplacedUsingDirectivesDiagnosticAnalyzer on Kaggle.XXX in 914ms
[01/05/2022 02:37:27] Running analyzer CSharpUnboundIdentifiersDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:28] Finished running analyzer CSharpUnboundIdentifiersDiagnosticAnalyzer on Kaggle.XXX in 653ms
[01/05/2022 02:37:28] Running analyzer CSharpUseExplicitTypeDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:29] Finished running analyzer CSharpUseExplicitTypeDiagnosticAnalyzer on Kaggle.XXX in 990ms
[01/05/2022 02:37:29] Running analyzer CSharpNamingStyleDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:30] Finished running analyzer CSharpNamingStyleDiagnosticAnalyzer on Kaggle.XXX in 868ms
[01/05/2022 02:37:30] Running analyzer CSharpAddBracesDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:31] Finished running analyzer CSharpAddBracesDiagnosticAnalyzer on Kaggle.XXX in 764ms
[01/05/2022 02:37:31] Running analyzer CSharpSimplifyTypeNamesDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:50] Finished running analyzer CSharpSimplifyTypeNamesDiagnosticAnalyzer on Kaggle.XXX in 19688ms
[01/05/2022 02:37:50] Running analyzer CSharpPreferFrameworkTypeDiagnosticAnalyzer on Kaggle.XXX
[01/05/2022 02:37:51] Finished running analyzer CSharpPreferFrameworkTypeDiagnosticAnalyzer on Kaggle.XXX in 477ms
[01/05/2022 02:37:51] Finished running 7 analyzers on Kaggle.XXX in 24407ms
Turns out there are known performance issues with this analyzer: https://github.com/dotnet/roslyn/issues/18950
The runtime seems to be proportional to number of files (or number of lines of code) in the project. We have several projects with over 300 files and this analyzer takes at least 5 seconds to run for each of these projects.
Another thing I noticed is that NamingStyleDiagnosticAnalyzer
always runs for all projects even if they are not included with --include
. For example, dotnet format --include SomeProject/SomeFile.cs
will run the NamingStyleDiagnosticAnalyzer
in SomeOtherProject
and just discard the result as no files are specified under SomeOtherProject
.
Filtering out the analyzer based on formattablePaths
before the following line fixes the issue: https://github.com/dotnet/format/blob/e473b1d1eb3d0a932501fad472219dca4d9c6576/src/Analyzers/AnalyzerFormatter.cs#L303
GetSeverityAsync
does have a formattablePaths
check but that's after the line above so NamingStyleDiagnosticAnalyzer
is always included.
This fixed's reduced the runtime of dotnet format --include SomeProject/SomeFile.cs
by 50% in our solution (from more than 2 minutes to 1 minute). I will prepare a PR
Here is the PR to fix the issue with redundant runs of NamingStyleDiagnosticAnalyzer
: https://github.com/dotnet/format/pull/1485
- I wrote a simple bash script to group modified files by project and run
dotnet format
concurrently for each affected project with--include <path>
option.
Hey @erdalsivri , could you perhaps share this script? Thanks!
I think it's a good idea to link this to #1479 because both need to be fix to make this tool actually usable in pre commit hooks
Just wanted to share some comments from #1479
Two things go into why Style and Analyzer operations are slower.
First, they require that we have correct semantic information about your code. To do this we rely on MSBuild performing a design-time build where we can know the exact set of code files as well as what references to include with each project. This is equivalent to what Visual Studio or OmniSharp does when opening a project/solution. It is less costly than a full compile, but it can take anywhere from seconds to minutes depending on the size of your solution.
Second, we have to actually run the analyzers and apply code fixes. Running analyzers requires additional in-memory compilations. The results of one analysis become invalid as we apply code fixes from earlier analysis.
We have discussed alternate ways to speed up populating our workspace. Ultimately MSBuild will be involved at some point because .NET builds are so highly customizable the only way not to be wrong is to have MSBuild do the work.
We are not committing to either approach today, but know that this a problem that we are thinking on.
@JoeRobich could you please share what is the current state of this effort?
I just ran dotnet format
on a unit tests project with 10 files which is a part of a large repository.
Loading workspace took ridiculous amount of time:
Complete in 1045066ms.
Determining formattable files.
What's puzzling is that it seems to figure out via MSBuild the whole dependency chain of projects to figure out their editorconfig settings and then simply discards them as they are not in scope.
.editorconfig
files or just using user-specified .editorconfig
without loading the whole project in memory?dotnet-format
as a default out-of-the-box formatter - this needs to be prioritized, funded and fixed.Happy to share my binlog if it helps.
Not to mention that dotnet format
process consumes 25GB of RAM while running.
I don't think any teams from Microsoft are using this, because if they were they would find it unacceptable.
I can confirm for my relatively small project dotnet format whitespace
runs 2x as fast as dotnet format
. Sadly, it still runs in 4 seconds even with only whitespace
, which is still a noticeable slow down in development flow compared to something like Ruff format from the Python ecosystem which runs in imperceivable time.
I had the following set up a File Watcher in Rider:
dotnet format whitespace $SolutionPath$ --include $FileRelativePath$
I was hoping that having it only format the single file I was working on would speed things up, but as mentioned above it seems like a full project build/analysis of some kind is involved.
Hi,
We are facing the same issue on running dotnet format
on a specific project containing dependencies which are in the workspace. The command seems to "load" everything (all the dependency tree) and take hours to finish 😴
It outputs
Duplicate source file xxx in project yyy
for all dependencies then
Project xxx is using configuration from dir
for same
Is there a reason for the command to "load" the dependencies even with --no-restore
and launched inside a project folder?
As a workaround, we tried :
--include
only files as describe in the readme ❌--folder
parameter ❌But in the end I think we will just create a fake ".csproj" file without dependencies to force formatting only in the directory where the command is run 🤷🏻it seems to work and only lint files in the directories under.
@johnthagen Does using the --folder
flag help? This will skip performing a design time build and is only available when using the whitespace
subcommand.
dotnet format whitespace $SolutionPath$ --folder --include $FileRelativePath$
Though that might help (which is nice), I think its been discussed here or in linked issues that this is not a proper solution. The analyzers/code style part of dotnet format is most useful, which is the part that is slow. Also, I believe it didnt work well with --include, which is very important for git hooks. (See linked issues)
I would not recommend people to use dotnet format whitespace. Instead, I switched to Csharpier for my whitespace and line breaks, which is faster and better compared to dotnet format.
Perhaps learnings could be taken from that project.
@JoeRobich --folder
does improve the situation further
dotnet format Project.sln --include .\src\Project\Main.cs
# 7.2 seconds
dotnet format whitespace Project.sln --include .\src\Project\Main.cs
# 2.5 seconds
dotnet format whitespace . --folder --include .\src\Project\Main.cs
# 1.5 seconds
Could someone explain what exactly is lost by using --folder
as it pertains to formatting the code? Are there drawbacks to this?
Could someone explain what exactly is lost by using
--folder
as it pertains to formatting the code? Are there drawbacks to this?
When using the --folder
option we are just looking at the source files beneath the specified folder path. We do not have any information about the solution or projects contained within the folder. We do not know which source files are actually part of the compilation, we do not know whether there are other projects being referenced, or which packages you might be referencing.
For whitespace formatting this is all okay. We don't need to know what code means in the same way running the code style or 3rd party analyzers would in order to add spaces and newlines.
@JoeRobich Thanks for the explanation.
Is there a reason --folder
isn't the default for dotnet format whitespace
? It seems like that would be better for (all?) use cases of the whitespace
command?
The downside is this part We do not know which source files are actually part of the compilation
. Meaning running dotnet format
and dotnet format whitespace
could work against two separate set of files.
For instance, the dotnet/format repo has test assets which include intentionally unformatted source. In our project file these files are removed from the compilation and would not be formatted during dotnet format whitespace
, but would be formatted when running dotnet format whitespace --folder
.
We could look at adding documentation around this and logging a link within a message when running the whitespace
subcommand to help users discover this option.
For my particular use case, the issue seems to be (from poking around OpenSolutionAsync
) that the solution is loaded 1 project at a time, sequentially: the code loops over each project and awaits
loading each one. This takes approx 90 seconds (for ~ 40 projects)
Is there any way project loading could be done in parallel? I guess this would be a change to Roslyn?
If I build my entire solution from scratch from the command line, it finishes in ~ 25 seconds. Loading the solution in Visual Studio only takes a few seconds (albeit there are all the continuing background tasks that confuse timings somewhat)
(I would have poked around a bit further, but loading a project ultimately boils down to an RPC call)
Or better alternative use csharpier, surprisingly, i'm not even sure why you guy still stick with this tool, it's slow, it's bad at format, it cannot do a proper function parameter alignment, even resharper is better at the job (which also being cock block by jetbrains)
i'm not even sure why you guy still stick with this tool,
I can't speak for everyone, but there is value in a tool that is installed by default with the dotnet
CLI, and a first party tool is generally more universally used than a third party. That said, I will probably try out CSharpier soon since in my experience opinionated formatters save so much time (dotnet format
is not very strict compared to things like Ruff format or Prettier).
Same there, it is way too slow and even if I find csharpier insanely great it doesn't support analyzers but there should be something in between (js did it quite well, I don't see why c# can't, maybe using a post build phase instead of plugging into the build directly and just reusing loaded model instead of building twice).
on our repository (https://github.com/xSke/PluralKit), dotnet-format will take between 4 and 8 seconds to run, even when no changes need to be applied.
I saw #757 but I'm not sure if it's related, thus the new issue.
here's a
-v diag
logging of a particularly bad run: