DevTeam / Pure.DI

Pure DI for .NET
MIT License
410 stars 21 forks source link

Performance of the source generator. #39

Open lsoft opened 4 months ago

lsoft commented 4 months ago

Hello!

I'm not sure about these topics, so I post them here only fyi.

  1. If I understand the things right, then https://github.com/DevTeam/Pure.DI/blob/master/src/Pure.DI/SourceGenerator.cs#L27 means you are collecting all syntax nodes inside a whole source tree of the project (solution?). Looks like it will have scaling problems. What are performance of this approach? Is there any performance benefits in comparison with old ISourceGenerator?
  2. Again, If I not missed something, pushing GeneratorSyntaxContext into generator pipeline https://github.com/DevTeam/Pure.DI/blob/master/src/Pure.DI/SourceGenerator.cs#L28 renders the generator to be completely non incremental. Additional context can be found here: https://github.com/zompinc/sync-method-generator/issues/20

As an author of similar library https://github.com/lsoft/DpDtInject I found no way to effiently apply incremental source generators for the task of DI. If I wrong, could you share where did I miss?

Good luck! Thanks!

NikolayPianikov commented 4 months ago

Unfortunately you may be right.

About true in the filter, I had plans to figure out how to separate only meaningful changes, but for now it's just the way it is. As for syntaxContext - I need both a syntax node and a semantic model.

And I'll play with changes in IDE and try to understand if there is any profit from incremental API.

PS: You are doing an interesting project!

lsoft commented 4 months ago

@NikolayPianikov you may benefit from incrementability test (see the reference above) and from new articles of Andrew Lock (this is about pipeline building and what to allowed to inject in).

figure out how to separate only meaningful changes

please pay attention that syntax provider reevaluated all filtered nodes even if only one node has changed: however will still be run again for the two methods https://github.com/dotnet/roslyn/blob/main/docs/features/incremental-generators.md

In general, I asked Rolsyn members about our topic (DI) in conjunction with source generators, and the answer is not good: looks like it is impossible to implement source-gen'd DI efficiently. It is the reason I stopped my project, I see no future here :\

Good luck!

ekalchev commented 4 months ago

@lsoft

It is the reason I stopped my project, I see no future here :\

Whoa!! How bad is the performance? Are we talking about visual studio freezing on file changes or slow build times? I am planning to migrate to source generator from Unity DI. I never thought this could be a problem?

lsoft commented 4 months ago

@ekalchev

How bad is the performance?

I did not estimate.

about visual studio freezing

yeah, we are talking about IDE experience only. It is the main sufferer. Please consider:

  1. source generators starts with every key press in IDE.
  2. Time spent in old source generators (and, PROBABLY, in this incremental source generator) increases like O(N), where N is the size of your project.

It does not matters for a small projects, the performance is good. But for very big codebases (for example Rolsyn-scale) these source generators are able to slow down you IDE experience to the level of seconds, if I remember the numbers correctly.

It is the reason I positioned my Dpdt to the library authors and choose to inject source code inside the user project. Libraries are smaller than applications.

Sorry for these news. Remember, I may be wrong!

ekalchev commented 4 months ago

Yeah, I am considering porting very large codebase. If I understand correctly the source generators, this issue should be present only for IIncrementalGenerators as they are evaluated on key press. Other type of generators should be run only during build and this won't be a problem.

Also our solution is over 250+ projects. I believe if I keep incremental source generators only in .exe projects that should not affect the code experience in other projects? Am I right?

lsoft commented 4 months ago

@ekalchev

Other type of generators should be run only during build

AFAIK there are 2 types of SG: incremental and the old ISourceGenerator. The latter have same weakness, they starts for every key press.

I believe if I keep incremental source generators only in .exe projects that should not affect the code experience in other projects? Am I right?

Sorry, I don't know. Consider organize an experiment.


In general, I'm sure everyone agrees that no one need to rebuild all DI trees for every key press. I guess the more correct way is to rebuild DI tree in a pre build phase somehow... But this have disadvantages and I chose to stop the development...

NikolayPianikov commented 4 months ago

The first version of Pure.DI was written using a non-incremental API. And I haven't noticed at least that the new API has become worse.

About performance. I have some public projects that contain quite large and complex compositions of objects, like this. And in the IDE I don't feel any problems at all. Private projects I can't show you. I periodically collect statistics on large compositions and found that CPU is wasted on:

The rest of the code runs fast. Besides, CancalationToken is passing everywhere, it allows you to interrupt further code execution and it is very convenient.

I have test for large configs and in general the generator shows itself well.

I don't think that with the current Roslyn API it is possible to reuse metadata from previous syntax trees in any way. The main place of optimization for me is to limit the set of trees to be analized and I plan to do that

NikolayPianikov commented 4 months ago

new articles of Andrew Lock (this is about pipeline building and what to allowed to inject in).

@ekalchev thx for this, will definitely read it.

ekalchev commented 3 months ago

The topic discussed here, bothers me, because I need to make a decision to transition huge solution 250+ csproj files from Unity DI to Pure.DI and I don't want to make the life of many developers miserable by degrading the performance of Visual Studio. I see some concerns about the performance of source generators here https://github.com/dotnet/roslyn/issues/55518 and it seems this is really a problem with source generators.

I wonder if Pure.DI really needs to generate code on each keystroke in Visual Studio and if it doesn't can I find a way to turn it off? This kind of source generator doesn't need to generate classes on the fly so you can call their interface immediately after typing something that triggers the source generation - you don't need that. I wonder if it is possible to avoid having Pure.DI generator included in MSBuild inline script completely and only add it inside a MSBuild Target before Build. This way the code will be generated only during build and not on a keystroke

<Target Name="GenerateSourceWithPureDI" BeforeTarget="CoreCompile">
     <!-- Somehow add Pure.DI here so it doesn't generate source while typing -->
</Target>

Alternatively, is it possible to have a MSBuild variable similar to DisableFrameworkReferenceAnalyzers that turns off Pure.DI generator and have this variable set for debug builds for example?

NikolayPianikov commented 3 months ago

@ekalchev

I have looked for performance and found no significant performance issues. Yes Roslyn is not fast with the semantic model. Unfortunately generators cannot distinguish between types of changes. I personally don't experience problems in large projects because generators work in the background and don't slow down the process of writing code. Moreover, if the code is outdated, the generation is canceled at any stage and a new one is started. But you can just turn the generator on and off like this:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net8.0</TargetFramework>
        <LangVersion>latest</LangVersion>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
        <Generator>False</Generator>
    </PropertyGroup>

    <ItemGroup>
        <PackageReference Include="Pure.DI" Version="2.1.9" Condition="'$(Generator)'=='True'">
            <PrivateAssets>all</PrivateAssets>
            <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
        </PackageReference>
    </ItemGroup>

</Project>

The MSBUILD variable "Generator" will allow you to disable and enable the generator from the MSBUILD variable.

You can do this for multiple projects at once if you place this logic in the Directory.Build.props file

If it would be more convenient I can add a special flag to enable and disable generation.

lsoft commented 3 months ago

Switching to msbuild stage has huge drawbacks and (from my pov) is not an option. one will no access to roslyn metadata inside msbuild step, so it will be necessary to rebuild such model again, which is horrible for compilation timings.

I proposed idea somewhere inside Roslyn to share Roslyn metadata across msbuild steps. That's a great idea, I think, because projects like this (and mine) will generate its code only once, at the start of compilation. But, Roslyn team is small and 100% its time is spent for every year race of new C#. No hope))

You can generate a huge fake solution and benchmark its performance in the scope of Pure.DI. In other hand, if your project suffers from poor DI-performance, I guess honestly, that this problem should be solved in some different way instead of switching to ISG-based DI.

ekalchev commented 3 months ago

In other hand, if your project suffers from poor DI-performance, I guess honestly, that this problem should be solved in some different way instead of switching to ISG-based DI.

My main motivation for source generated DI is that reflection based DI, require loading all dotnet assemblies at process start so the code can discover the constructor parameters. This may be ok for web apps but it is huge drawback for Desktop app where you need fast startup and delay the load of some assemblies or not load them all at if they are not needed in some scenarios. I am talking about hundreds of assemblies. I believe source generated DI will solve this problem, because the generated factories will load the assembly only when the type from this assembly is needed. Add to this reflection code on each type constructor and I believe I can half the startup time of my application and stop the degrading startup with adding more and more types and assemblies in the future. Why do think this is not valid reason to switch to source generated DI?

If it would be more convenient I can add a special flag to enable and disable generation.

I didn't know there was an option to turn it off. I think this will be enough to find a MSBuild solution if there are performance issues

lsoft commented 3 months ago

I can half the startup time

is that a real data or theoretical estimation? I'm just curious.

did you try Native AOT deployment?

ekalchev commented 3 months ago

I can half the startup time

is that a real data or theoretical estimation? I'm just curious.

did you try Native AOT deployment?

it is estimation based on profiling data. We can't use NativeAOT with WPF. Even if we migrate to different UI framework, reflection based DI will still be a showstopper for NativeAOT

NikolayPianikov commented 3 months ago

@ekalchev if you define the parameters of your big solution, I can create a test solution and test its performance with the code generator. By parameters I mean: