Add RegisterCompilationUnitStartAction to allow analyzers to immediately bail out from analyzing a file prior to processing nodes.

CyrusNajmabadi commented 8 months ago

Today an analyzer that wants to analyze syntax nodes needs to call RegisterSyntaxNodeAction(callback, SyntaxKind.xxx). The registered callback will then be called on every node with that kind in a particular file.

However, analyzers then need to often check if their particular options would disable their processing for a particular file. Given that there may be a large number of nodes of a particular kind in a file, this means calling back both on all those nodes, and then having all those nodes have to check the same option over and over again to see if it's disabled.

It would be good if we could avoid all that work by giving analyzers a point to first check if they are disabled for a particular file. We can accomplish that with a pattern we have for the other types of actions:

public abstract class AnalysisContext
{
+    /// <summary>
+    /// Register an action to be executed at semantic model start. A semantic model start action can register other
+    /// actions and/or collect state information to be used in diagnostic analysis, but cannot itself report any
+    /// <see cref="Diagnostic"/>s.
+    /// </summary>
+    /// <param name="action">Action to be executed at compilation start.</param>
+    public abstract void RegisterCompilationUnitStartAction(Action<CompilationUnitStartAnalysisContext> action);
}

+public abstract class CompilationUnitStartAnalysisContext
+{
+    /// <summary>Compilation unit that is the subject of analysis</summary>
+    public SyntaxTree CompilationUnit { get; }
+    /// <inheritdoc cref="CompilationStartAnalysisContext.Compilation"/>
+    public Compilation Compilation => _semanticModel.Compilation;
+    /// <inheritdoc cref="CompilationStartAnalysisContext.Options"/>
+    public AnalyzerOptions Options => _options;
+    /// <inheritdoc cref="CompilationStartAnalysisContext.CancellationToken"/>
+    public CancellationToken CancellationToken => _cancellationToken;

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterCodeBlockStartAction{TLanguageKindEnum}(Action{CodeBlockStartAnalysisContext{TLanguageKindEnum}})"/>
+    public abstract void RegisterCodeBlockStartAction<TLanguageKindEnum>(Action<CodeBlockStartAnalysisContext<TLanguageKindEnum>> action) where TLanguageKindEnum : struct;

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterCodeBlockAction"/>
+    public abstract void RegisterCodeBlockAction(Action<CodeBlockAnalysisContext> action);

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterOperationBlockStartAction"/>
+    public virtual void RegisterOperationBlockStartAction(Action<OperationBlockStartAnalysisContext> action);

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterOperationBlockAction"/>
+    public virtual void RegisterOperationBlockAction(Action<OperationBlockAnalysisContext> action);

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterSyntaxNodeAction{TLanguageKindEnum}(Action{SyntaxNodeAnalysisContext}, TLanguageKindEnum[])"/>
+    public void RegisterSyntaxNodeAction<TLanguageKindEnum>(Action<SyntaxNodeAnalysisContext> action, params TLanguageKindEnum[] syntaxKinds) where TLanguageKindEnum : struct;

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterSyntaxNodeAction{TLanguageKindEnum}(Action{SyntaxNodeAnalysisContext}, ImmutableArray{TLanguageKindEnum})"/>
+    public abstract void RegisterSyntaxNodeAction<TLanguageKindEnum>(Action<SyntaxNodeAnalysisContext> action, ImmutableArray<TLanguageKindEnum> syntaxKinds) where TLanguageKindEnum : struct;

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterOperationAction(Action{OperationAnalysisContext}, OperationKind[])"/>
+    public void RegisterOperationAction(Action<OperationAnalysisContext> action, params OperationKind[] operationKinds);

+    /// <inheritdoc cref="CompilationStartAnalysisContext.RegisterSyntaxNodeAction{TLanguageKindEnum}(Action{SyntaxNodeAnalysisContext}, ImmutableArray{TLanguageKindEnum})"/>
+    public virtual void RegisterOperationAction(Action<OperationAnalysisContext> action, ImmutableArray<OperationKind> operationKinds);
}

public abstract class CompilationStartAnalysisContext
{
+    public abstract void RegisterCompilationUnitStartAction(Action<CompilationUnitStartAnalysisContext> action);
}

This would then be used by our own analyzers like so:

    protected override void InitializeWorker(AnalysisContext context)
        => context.RegisterCompilationUnitStartAction(context =>
        {
            var option = context.GetCSharpAnalyzerOptions().FeatureXXXOption;
            if (ShouldSkipAnalysis(context, option))
                return;

             context.RegisterSyntaxNodeAction(ProcessSyntaxNode, SyntaxKind.xxx);
        });

// Or, also commonly:

    protected override void InitializeWorker(AnalysisContext context)
        => context.RegisterCompilationStartAction(context =>
        {
            // Feature not available before C#8
            if (context.Compilation.LanguageVersion() < LanguageVersion.CSharp8)
                return;

            context.RegisterCompilationUnitStartAction(context =>
            {
                var option = context.GetCSharpAnalyzerOptions().FeatureXXXOption;
                if (ShouldSkipAnalysis(context, option))
                    return;

                 context.RegisterSyntaxNodeAction(ProcessSyntaxNode, SyntaxKind.xxx);
             });
        });

With this, we'll now only get the syntax node callbacks for files that we should actually be scanning.

This would now be the optimal way to register a feature to work on nodes, while knowing it would have virtually no runtime impact if hte langversion were too low, or if the file was excluded from analysis.

CyrusNajmabadi commented 8 months ago

@mavasani does this make sense to you?

mavasani commented 8 months ago

Thanks @CyrusNajmabadi, this sounds good. We may want to first take this through the API review meeting to identify if we want RegisterSyntaxNodeStartAction, RegisterSyntaxTreeStartAction (or should it be RegisterSemanticModelStartAction) or both. Basically, each syntax node or tree (or model) will now allowed to register nested actions for nodes within that callback node or tree (or model) respectively. One thing to that needs to be discussed for the design - do we also allow registering nested IOperation actions, code block actions, operation block actions, etc.? Basically, analyzing syntax nodes within a tree is one aspect that this issue has been filed for, but we likely want to design this API to account for all code constructs that can be analyzed within a node/tree. Once we have an API approval, I'll sync up with @arkalyanms to schedule this work.

CyrusNajmabadi commented 8 months ago

We may want to first take this through the API review meeting to identify if we want RegisterSyntaxNodeStartAction, RegisterSyntaxTreeStartAction (or should it be RegisterSemanticModelStartAction) or both.

I think we want most forms. So, for example, RegisterCompilationStartAction would give you a context you could then RegisterSyntaxTreeStartAction on.

This would allow you to first do a compilation check. Bailing out, for example, if the lang version was too low. Then, you'd register a syntaxtreestartaction which would allow you to check at the per-file basis if you wanted to dive deeper.

Note: i strongly think we should then have an extension at the IDE layer where you pass in teh min lang-version and feature option you care about. We'll then do these checks for you, and only call back into you if the min version is met, and the file is enabled for that feature option. etc.

333fred commented 7 months ago

API Review

Naming needs to be changed
- RegisterSemanticModelStartAction implies much more functionality - being able to register operation or symbol actions
- RegisterSyntaxTreeStartAction seems more appropriate
Could the AnalyzerDriver actually know about this and just not run the analyzers?
- Are most analyzers going to define their options in a way that can be applied per-file?
Did prototype, saved a few percent CPU gains. Not huge, not nothing.
Will this bloat the state tracking for analyzers?
- Manish looked into this, it didn't seem bad.

Conclusion: Needs work. We'll come back after looking at whether the analyzer driver itself can handle the optimization

CyrusNajmabadi commented 7 months ago

I talked to manish, we can't move this down into compiler layer because this has to understand 'options', which are complex and exist at the host layer. As an example, ti needs to understand how things like options configure their severity (Which involves the existing dedicate IDE syntax of things likd a=true:warning or b=false:hidden.

333fred commented 6 months ago

API Review

Manish looked into the details and we need this regardless for the IDE.
Should you be able to register operations? Seems like it
Should we allow it from SymbolStartAction as well?
- That seems weird. What does it mean for a symbol action to be disabled for part of a symbol?
- No current requests for this.
Because of the IOperation requests, RegisterSemanticModelAction seems like the right name.

Conclusion: Needs work. We'll review the final proposed API with the rename and IOperation APIs offline on email.

dotnet / roslyn

Add RegisterCompilationUnitStartAction to allow analyzers to immediately bail out from analyzing a file prior to processing nodes. #72053

API Review

API Review