Closed bkoelman closed 6 years ago
/cc @mavasani @AlekseyTs @333fred Is this a bug or by design?
I think this is intentional. That IBlockOperation corresponds to the BlockSyntax. Documentation says "Finally handler of the try." and it is the handler. IOperation tree is not an annotated syntax tree, it is a different tree and some syntax nodes do not have corresponding IOperation nodes. This is by design.
I'm okay with the current .Finally.Syntax
, just wanted to check because there have been many bugs in this area. Thanks for investigating.
IOperation tree is not an annotated syntax tree, it is a different tree and some syntax nodes do not have corresponding IOperation nodes. This is by design.
Opinions may have changed over time, but at the introduction of IOperation
it was presented as a language-agnostic, richer alternative for syntax. Similar to how symbols provide semantic information for declarations, operations would provide semantic information for executable code blocks. And not "for about 80% of the possible constructs in code blocks."
There has always been discussion (here and here) about how close operations should match with syntax. And lots of bugs were fixed over the past years.
I can understand that, due to time constraints, V1 only supports the most essential constructs, but I always assumed that the gaps would be filled in at a later time. Not ever getting the missing operations feels like a broken experience and a lost investment in using IOperation
in many analyzers that I have written. After all, we do not have a walker that traverses operations, and then also some syntax nodes for their lack of operation support.
Tagging @CyrusNajmabadi @heejaechang @mavasani @genlu @jinujoseph @srivatsn for thoughts.
I think the point is that we have IOperation nodes for things that have semantic meaning. We do intend to fill in the gaps of missing nodes for semantics that we haven't covered yet, including the ones in the PR you linked. The current goal is to enable dataflow analysis of code blocks to enable analyzers missed in the FxCop port, such as CA2000 or CA2100. After the control-flow graph and dataflow analysis frameworks have shipped we'll examine what's next for the API, and I certainly intend to push for full support of all missing nodes as part of that.
Completely agree with @333fred. We do intend to fill the semantic gaps soon. We have banked another big analysis feature on this API (flow analysis) and intend it to be complete and fully supported API. However, we never intended IOperation tree to be 1:1 match with syntax tree, it is a semantic tree and so you will have syntactic constructs which have no matching IOperation node.
I like how operations abstract away various syntactic details, such as redundant braces or various notations to initialize an array. And I'm not expecting all syntax nodes to have an operation counterpart. The subtle differences between C# and VB alone would make that unfeasible. But not having it cover all semantic language constructs makes using it a major pain. Allow me to explain with some examples.
Say I want to analyze assignments. So I create an operation walker and handle VariableDeclarator
, SimpleAssignment
, CompoundAssignment
, IncrementOrDecrement
and DeconstructionAssignment
. All works fine, until I realize it does not hit assignments via a let
expression inside a query. IOperation
does not support queries. So aside from an operation walker, I need to register callbacks for various syntax nodes for my analysis to be complete. And then the problem becomes relating the results from both sources to each other.
Another example. I want to analyze the target block when a checked/unchecked scope is entered. On the syntax level, there's a statement for that, but it does not appear in the operation tree. Knowing the current checked/unchecked state from an operation exists, but the entering of such a block is semantic information too, right? As a workaround I need to register for IBlockSyntax
and inspect the parent syntax to see if we are entering such a block. But it's quite expensive to run that on every block! Alternatively, I can switch to syntax walker and track the effective context myself (while it's already implemented in operations). Having an operation that indicates such a block is entered, along with the calculated effective context in the target block would help. And that same operation can be reused for checked expressions. Which reminds me of the difficulty to determine whether an operation represents an expression or a statement, something else that's painfully missing at the moment.
One more. I want to analyze the names of types, members, parameters, local functions, tuple elements etc. So I register for various symbol kinds and the operations VariableDeclarator
, Tuple
and AnonymousObjectCreation
. This all works, but again not for what happens in queries. So after a lot of trial-and-error, I discover I also need to register syntax analysis for FromClause
, JoinClause
, JoinIntoClause
, QueryContinuation
and LetClause
. But now I need to handle and special-case redundant braces, casts and all myself.
Looking back in time, let's forget about the sudden disappearance of IOperation.HasErrors
, Operation.Syntax
sometimes returning null
and the lack of exposing IOperation.WasCompilerGenerated
for a long time, but suffice it to say its been a bumpy ride. And I'm still hoping things will get better now that IOperation is no longer experimental.
I get that the team is currently focussed on getting CFG ready and am happy to hear additional operations will be addressed after that. I'd like you to also consider my scenarios above. When time permits, I'll be happy to contribute as I did in the past, but that must fit in the teams' design plans obviously.
I am +1 on making it easy to mix and match between different analysis hooks (operation, syntax, code blocks)
by the way, @bkoelman you could help us by sending us PR? :)
Tagging @dotnet/analyzer-ioperation
@bkoelman Thank you for the detailed comment on the missing pieces. Let me try to summarize your requests:
HasErrors
or IsInvalid
property to know about malformed code.Is there anything I missed out from your list? Otherwise, I will file issues for these or ensure they exist and we will prioritize these requests.
@heejaechang I can probably contribute some in a few weeks, though I prefer to wait until CFG has been merged. IsExpression/IsStatement would be a good start for me to pick up.
@mavasani thanks for hearing me! Those 4 are my biggest pain points. But... there is more... :)
Essentials:
CompilationWithAnalyzers.GetAnalyzerDiagnosticsAsync()
that also returns a boolean to indicate "during analysis, at least one unhandled exception was caught". Having richer info, such as which exception(s) per analyzer is nice-to-have.IMethodSymbol
. My current solution via syntax is tedious and probably incomplete. A reliable API that works on symbols for methods, fields and other places that may contain a code block (and which works for both blocks and expression bodies) would be very welcome.Nice-to-haves:
IReturnOperation
would be hit for a yield break
statement. While it is straightforward to determine from syntax, a boolean property IsYield
would be nice. Callers can then inspect this property, combined with operation.Type
and operation.ConstantValue
to distinguish between yield return
and yield break
without looking at syntax. Though also not very obvious, until you think of the return;
and return null;
cases. So to make it even easier, IsYield
could be replaced with a ReturnKind
enum { OrdinaryReturn
, YieldReturn
, YieldBreak
}.IBranchOperation
, when it represents a break
or continue
statement, exposed a reference to the parent loop that it is breaking from or continuing at.IOperation
in Syntax Visualizer would be very helpful!.Instance
being null
for static calls, which makes them difficult to use.IReturnOperation would be hit for a yield break statement. While it is straightforward to determine from syntax, a boolean property IsYield would be nice.
IReturnOperation has three distinct OperationKinds: Return, YieldBreak, YieldReturn.
@bkoelman CFG work merged into the dev15.8-preview3 branch yesterday, and should make it into master sometime today. I would also definitely encourage opening issue(s) for anything we're not yet tracking. A lot of our initial design for IOperation and CFG was driven by the idea that we'd come back and add convenience APIs for things in a community-driven manner, as we had a limited time budget for V1. So your feedback is very important for shaping the future direction of the API.
@AlekseyTs Thanks, looks like I missed those return kinds. Consider that one resolved.
Something else that I keep running into: compiler-generated operations have caused me nothing but lots of hard-to-trace bugs, because they typically have no location to report on. So VS reports them, and then you need to try to figure out which code fragment in the project is causing it. Or sometimes they report an invalid location. Example: a void lambda that has an implicit return, while a method does not. These come as surprises when just trying to get the basics right. My code is already sprinkled with ifs to skip over them, and yet they keep popping up. Therefore I have come to think implicit operations should never surface by default. Not in analyzer callbacks, not in operation.Children/Parent and not in operation visitors or walkers. Opt-in should be possible similar to context.ConfigureGeneratedCode (see https://github.com/dotnet/roslyn/issues/10214). There is one analyzer where I am interested in implicit conversions, where I then need to opt-in. Forgetting that is quickly discovered when building such an analyzer. It is a breaking change, though.
Something else that I keep running into: compiler-generated operations have caused me nothing but lots of hard-to-trace bugs
I think, no matter what, once you have an AST you're going to have impedence mismatch issues between the AST and your domain. While it would be attractive to want an ast perfectly suited to any domain, it's ultimately always going to be a problem for some customer.
have caused me nothing but lots of hard-to-trace bugs, because they typically have no location to report on.
Isn't there IsImplicit
for this sort of thing?
Additionally, all Operations should have a corresponding syntax, even if they're implicit. Are you saying that you're encountering nodes with no syntax?
have no location to report on.
Sorry, I should have clarified that. The compiler-generated operations (.IsImplicit == true
) actually do have syntax and a location. But the analyzer I am referring to reports on the return
keyword instead of the whole returned expression. Because when a multi-line lambda expression is returned, the squiggle becomes very long and disturbing (and visually hides any nested diagnostics). So the analyzer looks at the syntax type (ReturnStatementSyntax
or YieldStatementSyntax
) to find the location for the return
or yield return
keyword span. Being unable to, it results in Location.None
. Now I am convinced that the analyzer should not report in that case. But at that time, I thought such a situation could never happen.
@CyrusNajmabadi True that a tree will never fit all needs. But I think it helps to have defaults that do "the expected thing" for unexperienced API consumers that have no knowledge of roslyn internals. Like me, back then, when I was unaware that compiler-generated nodes even existed. Not getting them keeps life simple for the simple things. Advanced scenarios like analyzing allocations requires such deep knowledge of the language that I think it would be okay for additional work to be needed to get that right.
@bkoelman I'm not sure that makes sense in many cases. There are a good number of scenarios where implicit nodes are very, very necessary to understand anything about the semantics. Take implicit this
or base
calls, for example. It wouldn't make any sense to have an IInvocationOperation
with a null InstanceReceiver
because the receiver was implicit. I think that consumers of this API may well be surprised when nodes crop up, but those surprises are how the language works and if they want their analyzers to be accurate, they need to handle those scenarios.
There are some cases where there are inconsistencies, like the lambda case you pointed out. But I don't think that's an argument for hiding implicit nodes; rather, I think it's an argument for fixing inconsistencies.
But I think it helps to have defaults that do "the expected thing" for unexperienced API consumers that have no knowledge of roslyn internals.
We err'ed in a different direction. These are supposed to represent the "operations" that actually become the executable code at the end of the day. So, it's important to represent all these things. Even if they're things that the user doesn't write but which teh compiler inserts.
This was also important because a core customer was one that would be looking for issues regardless if those were backed by actual syntax the user wrote, or if they were things the compiler generated itself.
It wouldn't make any sense to have an
IInvocationOperation
with a nullInstanceReceiver
because the receiver was implicit.
I agree. Properties of operations should not be null when they refer to an implicit operation. My proposal is to (based on configuration during analyzer init):
operation.Children
One way to overcome the first two without roslyn changes:
public static class OperationExtensions
{
public static IEnumerable<IOperation> GetExplicitChildren(IEnumerable<IOperation> children)
{
foreach (IOperation child in children)
{
if (!child.IsImplicit)
{
yield return child;
}
}
}
}
public class ExplicitOperationVisitor : OperationVisitor
{
public override void Visit(IOperation operation)
{
if (operation != null && !operation.IsImplicit)
{
operation.Accept(this);
}
}
}
public abstract class ExplicitOperationVisitor<TArgument, TResult> : OperationVisitor<TArgument, TResult>
{
public override TResult Visit(IOperation operation, TArgument argument)
{
if (operation != null && !operation.IsImplicit)
{
return operation.Accept(this, argument);
}
return default(TResult);
}
}
public abstract class ExplicitOperationWalker : OperationWalker
{
public override void Visit(IOperation operation)
{
if (operation != null && !operation.IsImplicit)
{
base.Visit(operation);
}
}
}
This works, however requires some internal knowledge to come up with and is not configurable.
To compare, I investigated how IMethodSymbol
behaves for a compiler-generated ctor:
context.RegisterSymbolAction
for SymbolKind.Method
does not get calledINamedType.GetMembers()
includes the compiler-generated ctorSymbolVisitor
visits the compiler-generated ctorFor consistency with symbols, at the very least I think there should be a way (if not by default) to prevent analyzer callbacks for implicit operations. This is basically what question 1 in #10214 is about.
When an analyzer throws, the exception is not propagated into a unit test.
I found a workaround in StyleCop which I am now using: special-case when AD0001 is returned. So this one is not relevant anymore.
When an analyzer throws, the exception is not propagated into a unit test.
@bkoelman Can you clarify this? We have a helper method GetAnalyzerDiagnostics that has an optional parameter Action<Exception, DiagnosticAnalyzer, Diagnostic> onAnalyzerException = null
, which does propagate the exception info if the caller desires. If you are creating your own CompilationWithAnalyzers
instance, you can pass in such a delegate to the constructor of CompilationWithAnalyzerOptions
It would be nice if IBranchOperation, when it represents a break or continue statement, exposed a reference to the parent loop that it is breaking from or continuing at.
IBranchOperation.Target points the the destination label symbol. ILoopOperation now exposes the break and continue label symbols and ISwitchOperation exposes an exit lable, so you should be able to match the branches with parent loop/switch if desired. I filed https://github.com/dotnet/roslyn/issues/28095 to track this request.
An API to get operation block(s) from an IMethodSymbol
HasErrors or IsInvalid property to know about malformed code.
We have decided against exposing any such information directly on operations nodes and also any extension method for it:
HasErrors
or IsInvalid
can have vastly different semantics based on client requirements. Is the client looking for syntax errors or semantic errors? Do they want to know of a diagnostic of a specific severity or above or any diagnostic? Do they want to know if the diagnostic is on the span related to this operation only but not on any descendant (for example, if there is a diagnostic on an expression, should this invalid flag propagate to ancestors or not)?GetDiagnostics
API. We can work on optimizing these, if we are clear on the required semantics, but again this is non-trivial work.Our recommended approach for clients is to instead check for validity of data such as type exposed directly on the operation node and child nodes (for example, is one of the expected non-null child operation null). If their intent is to strictly operate on operations which have no compiler diagnostics, they should call the relevant GetDiagnostics
API, keeping in mind the associated perf implications.
@bkoelman Thanks a lot for your feedback, and your willingness to help us out by contributing in this area. I believe all of your requests or concerns in this issue have now either been addressed or have a separate specific issue tracking them. If you feel something has been missed or disagree with some statements/claims made here, please file a separate issue to discuss and track that work. I am closing this issue as the original issue filed here is by design.
We have a helper method GetAnalyzerDiagnostics...
My project has a NuGet reference to Microsoft.CodeAnalysis v2.8.2, which does not contain type Microsoft.CodeAnalysis.DiagnosticExtensions
. According to source browser, it lives in a project called TestUtilities. Maybe that does not get distributed via NuGet?
For reference, I am using this code, which I copied years ago from another repo.
@bkoelman Seems like your code here can be changed to use this overload of WithAnalyzers
API. The CompilationWithAnalyzersOptions
wrapping the existing AnalyzerOptions
takes an optional delegate to callback on analyzer exception.
Thanks for pointing this out. It works when I marshal the exception from the callback to the thread that the unittest is running on (which is different from the thread that the analyzer and callback runs on).
Version Used: Microsoft.CodeAnalysis v2.8.0
Steps to Reproduce:
ITryOperation
for the nexttry
statement: