dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.21k stars 1.57k forks source link

`dart analyze` says "No issues found!" even if analyzer crashes #49931

Closed stereotype441 closed 5 months ago

stereotype441 commented 2 years ago

The following program crashes the analyzer:

foo() {
  void Function({@deprecated int i}) b = bar;
  print(b);
  b(i: 1);
}

bar({@deprecated int i = 0}) {}

I can tell that it crashes the analyzer, because (a) if I analyze it in my IDE, all IDE features that rely on the analysis server stop working, and (b) if I run the analyzer directly, using out/ReleaseX64/dart-sdk/bin/dart out/ReleaseX64/gen/dartanalyzer.dart.snapshot, it prints out a stacktrace.

However, if I analyze the program using dart analyze, it says:

Analyzing test.dart...
The analysis server timed out while shutting down.
No issues found!
The analysis server shut down unexpectedly.
Please report this at dartbug.com.

I think it's really misleading to print No issues found! in the case where the analyzer crashed; it would be really for a user to ignore the other text and assume their code was free of errors.

stereotype441 commented 8 months ago

I ran into this issue again today and it caused me a lot of confusion. I was trying to study a test case that I expected to produce a compile-time error and the analyzer was saying "No issues found!". Eventually I figured out that I was running the version of the analyzer that's distributed to internal Google machines, and that this version crashes if the user's LOAS2 certificate has expired (internal issue: b/326991790).

DanTup commented 7 months ago

Seems like the error for the code above on the latest code is actually not a crash. This may have changed since the above, since the output is different - the current output doesn't suggest anything went wrong at all:

Analyzing sdk_49931...
No issues found!

And if I print the traffic there's also no indication of a failure:

Analyzing sdk_49931...
=STDIN=> {"id":"1","method":"server.setSubscriptions","params":{"subscriptions":["STATUS"]}}
=STDIN=> {"id":"2","method":"analysis.setAnalysisRoots","params":{"included":["C:\\Dev\\Test Projects\\sdk_49931"],"excluded":[]}}
<=STDOUT= {"event":"server.connected","params":{"version":"1.36.0","pid":3876}}
<=STDOUT= {"id":"1"}
<=STDOUT= {"event":"analysis.errors","params":{"file":"C:\\Dev\\Test Projects\\sdk_49931\\analysis_options.yaml","errors":[]}}
<=STDOUT= {"event":"analysis.errors","params":{"file":"C:\\Dev\\Test Projects\\sdk_49931\\pubspec.yaml","errors":[]}}
<=STDOUT= {"event":"server.status","params":{"analysis":{"isAnalyzing":true}}}
<=STDOUT= {"id":"2"}
<=STDOUT= {"event":"server.status","params":{"analysis":{"isAnalyzing":false}}}
=STDIN=> {"id":"3","method":"server.shutdown","params":null}
<=STDOUT= {"event":"analysis.flushResults","params":{"files":["C:\\Dev\\Test Projects\\sdk_49931\\analysis_options.yaml","C:\\Dev\\Test Projects\\sdk_49931\\pubspec.yaml"]}}
<=STDOUT= {"id":"3"}
No issues found!

If I enable the instrumentation log when using LSP I see the real exception (complete log here):

1709306238913:Ex:Analysis failed:: C::\Dev\Test Projects\sdk_49931\bin\sdk_49931.dart context:: exception_20240301_151718_913
type 'GenericFunctionTypeElementImpl' is not a subtype of type 'ExecutableElement' in type cast
#0      BaseDeprecatedMemberUseVerifier._isLocalParameter (package::analyzer/src/error/deprecated_member_use_verifier.dart::278::55)

I think I've seen this before - when analysis fails in this way, we reject the specific requests (for example if a Hover was sent it would return an error) but the errors aren't reported to the client during analysis. The code that "silently" handles this exception is here:

https://github.com/dart-lang/sdk/blob/152703335850b14728dea1ede8c8e1b69ca95d52/pkg/analysis_server/lib/src/analysis_server.dart#L726-L731

And references https://github.com/dart-lang/sdk/issues/39284. I guess the reason for this was to avoid spamming the user with errors, and in the case that we're running in an IDE we don't want to shut down the whole server just because a single file failed to analyse (maybe the user will modify the file and the issue goes away).

So, it's not totally clear what the fix is here..

We could report it as a diagnostic for the file - that would show up to the user in IDEs and also in the output here.. however, I don't know if this could cascade and the user end up with this error repeated for a large number of files if they happened to import the "broken" file.

Another option could be that we send it to the client (although in what kind of event?) with some throttling (eg. we don't send if we haven't sent one in the last x seconds/minutes).

@bwilkerson any thoughts?

(@stereotype441 btw, I don't know if this way of failing is the same as the one you saw this week, so it's possible that fixing the above won't fix that, so there might be an additional issue to resolve here.. do you happen to have any more info about the way it's failing and/or the output for that case?).

bwilkerson commented 7 months ago

First, if the user has opted in to analytics and we hit https://github.com/dart-lang/sdk/blob/152703335850b14728dea1ede8c8e1b69ca95d52/pkg/analysis_server/lib/src/analysis_server.dart#L726-L731 the instrumentationService should be, directly or indirectly, a CrashReportingInstrumentation so that the exception is being reported. If that's not the case, then we need to fix that so that we're aware of bugs in our tools and can fix them.

When we're running from the command-line I think the behavior we want is to report to the user

Using the legacy protocol we could do this using a server.error notification. If there are multiple exceptions then I suspect that it would be sufficient to report the first given that we don't have a reliable way to determine whether or not subsequent exceptions were a result of the first.

When running in an IDE I think the behavior is a little harder to get right. I think that's largely a result of the unknown impact of an exception on the state of the server. Sometimes it would be better for the user if we just stopped the server; sometimes it would be better if the server continues to run. Sometimes restarting the server would help, sometimes it would just result in an infinite restart-crash cycle.

While it seems reasonable to tell the user when the analysis results might be wrong, we need to consider whether there's a signal that we could use to tell the user when the analysis is once again trustworthy.

I don't like the idea of generating a diagnostic when an exception is hit because it's completely non-actionable from the user's perspective. That said, it has the benefit that when the file is re-analyzed, if we don't hit the exception, then the diagnostic will go away. I don't know whether we'd always know which library to report the diagnostic against given that we analyze a whole cycle at a time.

We might be able to do something similar that might be better. We could report, after each library cycle analysis, whether there were any exceptions in any of the libraries, report to the user when the first exception is reported, and report to the user again when the last exception is cleared. I don't know how complicated it would be to implement.

I don't know if this could cascade and the user end up with this error repeated for a large number of files if they happened to import the "broken" file.

I don't know either, but it does seem likely that the exception will leave the element model in an incomplete state and that there could be false diagnostics reported because of the state of the element model.

DanTup commented 7 months ago

First, if the user has opted in to analytics and we hit ... so that the exception is being reported

I added some logging to CrashReportingInstrumentation._sendServerReport and it showed up in the instrumentation log:

1709816095136:Info:Sending report to serverReports (null):: type 'GenericFunctionTypeElementImpl' is not a subtype of type 'ExecutableElement' in type cast

However, that's gated on a check to analyticsManager.analytics.okToSend, and that is always false for NoOpAnalytics which is what we always end up with here because of this check:

https://github.com/dart-lang/sdk/blob/7e2679df0596972cae9854032dbf0db987832cff/pkg/analysis_server/lib/src/server/driver.dart#L232-L237

In this case, client-id is dart-$commandName. So there are probably reports from VS Code users, but not from CLI or other IDE users. I don't know what the right fix is there - should we add dart-analyze to that condition, or should we try to eliminate the condition? (having to map incoming client-ids seems unfortunate... if other tools are using the LSP server and sending a good client-id, we'd never know about them unless they know to come and ask to be mapped here)

We might be able to do something similar that might be better. We could report, after each library cycle analysis, whether there were any exceptions in any of the libraries, report to the user when the first exception is reported, and report to the user again when the last exception is cleared. I don't know how complicated it would be to implement.

I think I understand the suggestion, but I also don't know how easy that'd be - in part because I'm not too familiar with the analysis of these library cycles. It seems like we'd need to track this status against a library cycle, but modifying code can presumably alter those, so I'm not sure how we'd manage that.

As a short-term fix, perhaps dart analyze could send some flag that makes the errors that go through the code above non-silent (--no-silent-errors or some env variable or something)? That would avoid changing the IDE behaviour (risking spamming or confusing users more than they may be already) but not hide the error from dart analyze.

Another possibility could be for dart analyze to ask for/pull the errors instead of getting them in subscriptions (some kind of custom request). This could also get avoid dart analyze needing to deal with the analysis status notifications, because the handling of the request could just wait for all in-progress analysis to complete before returning a result (or in the case of any analysis errors, return a failure). It does mean adding some dart analyze specific handling to the server though (LSP does have a pull-diagnostics request that we haven't yet implemented, but it's per-file).

bwilkerson commented 7 months ago

I don't know what the right fix is there ...

I don't either. @mit-mit I'm hoping you can provide some guidance.

... having to map incoming client-ids seems unfortunate...

I agree. I think the only restriction we have is that we need to be able to display the "you can opt out" message to users, so this shouldn't be gated by the client-id, it should be gated by the client's capabilities. This looks like some initial implementation work that should have been fixed. (The rules might be different for the command-line tools.)

It seems like we'd need to track this status against a library cycle, but modifying code can presumably alter those, so I'm not sure how we'd manage that.

When I typed "We could report, after each library cycle analysis," that should have been "We could record, after each library cycle analysis,". The thought, though, was to track each individual library in the cycle separately, but to do the recording after analyzing a cycle. If the content of the cycles changes then we'll re-analyze all of the libraries in their new cycles and update the record accordingly. I also glossed over details like the need to clear the record when a library is deleted.

As a short-term fix, perhaps dart analyze could send some flag ...

That depends on whether the permissions for sending this kind of data comes from the dart tool or from the analysis server. I don't understand the criteria used to make that choice, so I can never remember which way we decided.

It does mean adding some dart analyze specific handling to the server ...

The legacy protocol (which is what we're currently using) also has a pull model in the form of analysis.getErrors. I'm not sure how either protocol handles the case where a file is excluded but the client explicitly asks for errors, and I don't want the command-line tool to have to understand the analysis options file. But yes, it's an option worth considering.

pq commented 7 months ago

Friendly triage ping: any updates on this one?

DanTup commented 7 months ago

@pq

Sorry, no progress. There's an open question above about analytics (these crashes are going unnoticed), but I think the fix here is going to be a little more involved than first thought (it's not just handling the server crashing).

@bwilkerson

As a short-term fix, perhaps dart analyze could send some flag ...

That depends on whether the permissions for sending this kind of data comes from the dart tool or from the analysis server. I don't understand the criteria used to make that choice, so I can never remember which way we decided.

I'm not sure I understand what "permission" means here but I think my suggestion might not have been very clear. What I meant was unrelated to analytics, but whether dart analyze could start the server with some flag that could be used in ErrorNotifier.logException to not skip sending the errors to the client (dart analyze) when they are "SilentExceptions".

Specifically, this condition here:

https://github.com/dart-lang/sdk/blob/f7d5d0cfbeaade73f322811da5e1527cd053201f/pkg/analysis_server/lib/src/server/error_notifier.dart#L21-L24

This condition means that the analysis failure is not sent to the client. If the client (dart analyze) had indicated that it wants to know about "silent exceptions" and we sent it here, then I think dart analyze could handle this case better and show the user a better message.

There's probably still a question about whether all of these errors should be sent or only the first.. but if we send them all, dart analyze could choose what to show to the user.

bwilkerson commented 7 months ago

... whether dart analyze could start the server with some flag ...

Sorry for the misunderstanding.

Yes, it would be fine for the command-line tool to use either a command-line flag or an optional field in a request to cause the server to behave the way it needs to in order to make this all work correctly.

DanTup commented 7 months ago

With the changes in https://dart-review.googlesource.com/c/sdk/+/358902, the provided example no longer reports "No errors found!" and instead prints the exception (this was existing code, it just triggers):

PS C:\Dev\Test Projects\sdk_49931> C:\Dev\Google\dart-sdk\sdk\out\ReleaseX64\dart-sdk\bin\dart.exe analyze
Analyzing sdk_49931...
Error from the analysis server: Internal error: type 'GenericFunctionTypeElementImpl' is not a subtype of type 'ExecutableElement' in type cast
#0      BaseDeprecatedMemberUseVerifier._isLocalParameter (package:analyzer/src/error/deprecated_member_use_verifier.dart:282:55)
#1      BaseDeprecatedMemberUseVerifier._checkForDeprecated (package:analyzer/src/error/deprecated_member_use_verifier.dart:161:9)
#2      BaseDeprecatedMemberUseVerifier._simpleIdentifier (package:analyzer/src/error/deprecated_member_use_verifier.dart:232:5)
#3      BaseDeprecatedMemberUseVerifier.simpleIdentifier (package:analyzer/src/error/deprecated_member_use_verifier.dart:141:5)
#4      BestPracticesVerifier.visitSimpleIdentifier (package:analyzer/src/error/best_practices_verifier.dart:735:25)
#5      SimpleIdentifierImpl.accept (package:analyzer/src/dart/ast/ast.dart:16223:50)
#6      LabelImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:11110:12)
#7      RecursiveAstVisitor.visitLabel (package:analyzer/dart/ast/visitor.dart:1314:10)
#8      LabelImpl.accept (package:analyzer/src/dart/ast/ast.dart:11106:50)
#9      NamedExpressionImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:12610:11)
#10     RecursiveAstVisitor.visitNamedExpression (package:analyzer/dart/ast/visitor.dart:1404:10)
#11     NamedExpressionImpl.accept (package:analyzer/src/dart/ast/ast.dart:12601:50)
#12     NodeListImpl.accept (package:analyzer/src/dart/ast/ast.dart:13034:20)
#13     ArgumentListImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:510:16)
#14     RecursiveAstVisitor.visitArgumentList (package:analyzer/dart/ast/visitor.dart:810:10)
#15     ArgumentListImpl.accept (package:analyzer/src/dart/ast/ast.dart:506:50)
#16     FunctionExpressionInvocationImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:8637:19)
#17     RecursiveAstVisitor.visitFunctionExpressionInvocation (package:analyzer/dart/ast/visitor.dart:1194:10)
#18     BestPracticesVerifier.visitFunctionExpressionInvocation (package:analyzer/src/error/best_practices_verifier.dart:495:11)
#19     FunctionExpressionInvocationImpl.accept (package:analyzer/src/dart/ast/ast.dart:8626:15)
#20     ExpressionStatementImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:6379:17)
#21     RecursiveAstVisitor.visitExpressionStatement (package:analyzer/dart/ast/visitor.dart:1080:10)
#22     ExpressionStatementImpl.accept (package:analyzer/src/dart/ast/ast.dart:6375:50)
#23     NodeListImpl.accept (package:analyzer/src/dart/ast/ast.dart:13034:20)
#24     BlockImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:1979:17)
#25     RecursiveAstVisitor.visitBlock (package:analyzer/dart/ast/visitor.dart:864:10)
#26     BlockImpl.accept (package:analyzer/src/dart/ast/ast.dart:1975:50)
#27     BlockFunctionBodyImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:1930:12)
#28     RecursiveAstVisitor.visitBlockFunctionBody (package:analyzer/dart/ast/visitor.dart:870:10)
#29     BlockFunctionBodyImpl.accept (package:analyzer/src/dart/ast/ast.dart:1922:50)
#30     FunctionExpressionImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:8547:11)
#31     RecursiveAstVisitor.visitFunctionExpression (package:analyzer/dart/ast/visitor.dart:1188:10)
#32     BestPracticesVerifier.visitFunctionExpression (package:analyzer/src/error/best_practices_verifier.dart:489:11)
#33     FunctionExpressionImpl.accept (package:analyzer/src/dart/ast/ast.dart:8536:50)
#34     FunctionDeclarationImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:8381:25)
#35     RecursiveAstVisitor.visitFunctionDeclaration (package:analyzer/dart/ast/visitor.dart:1176:10)
#36     BestPracticesVerifier.visitFunctionDeclaration (package:analyzer/src/error/best_practices_verifier.dart:468:13)
#37     FunctionDeclarationImpl.accept (package:analyzer/src/dart/ast/ast.dart:8375:50)
#38     NodeListImpl.accept (package:analyzer/src/dart/ast/ast.dart:13034:20)
#39     CompilationUnitImpl.visitChildren (package:analyzer/src/dart/ast/ast.dart:3650:21)
#40     RecursiveAstVisitor.visitCompilationUnit (package:analyzer/dart/ast/visitor.dart:942:10)
#41     CompilationUnitImpl.accept (package:analyzer/src/dart/ast/ast.dart:3643:50)
#42     LibraryAnalyzer._computeWarnings (package:analyzer/src/dart/analysis/library_analyzer.dart:446:10)
#43     LibraryAnalyzer._computeDiagnostics (package:analyzer/src/dart/analysis/library_analyzer.dart:312:9)
#44     LibraryAnalyzer.analyze (package:analyzer/src/dart/analysis/library_analyzer.dart:106:5)
#45     AnalysisDriver._analyzeFile.<anonymous closure> (package:analyzer/src/dart/analysis/driver.dart:1366:11)
<asynchronous suspension>
#46     PerformanceLog.runAsync (package:analyzer/src/dart/analysis/performance_logger.dart:50:14)
<asynchronous suspension>
#47     AnalysisDriver._produceErrors (package:analyzer/src/dart/analysis/driver.dart:1952:5)
<asynchronous suspension>
#48     AnalysisDriver.performWork (package:analyzer/src/dart/analysis/driver.dart:1236:7)
<asynchronous suspension>
#49     AnalysisDriverScheduler._run (package:analyzer/src/dart/analysis/driver.dart:2374:7)
<asynchronous suspension>

null
PS C:\Dev\Test Projects\sdk_49931>

(@stereotype441 I'm unable to confirm if the new crash you see is of the same kind, so you might have to test this to see if there's still another outstanding issue)

It turned out there were multiple reasons why errors weren't being sent to the client.. In addition to being "silent", the server reference was not set correctly in ErrorNotifier, and ErrorNotifier was never added to the set of instrumentation services (see the commit message on https://dart-review.googlesource.com/c/sdk/+/358902 for more details).

Because of this, there's the possibility that users weren't seeing other (non-silent) exceptions before that we expected them to, so we might see an increase in user reports of errors with that change (although I believe these would all be exceptions that are already occurring and being sent to crash reporting when triggered from VS Code).

I've opened https://github.com/dart-lang/sdk/issues/55260 for the other issue about analytics/crash reports not happening for these crashes from dart analyze.

stereotype441 commented 7 months ago

(@stereotype441 I'm unable to confirm if the new crash you see is of the same kind, so you might have to test this to see if there's still another outstanding issue)

I patched in your change, and unfortunately it doesn't seem to address the internal use case I ran into ☹️. Probably we should go ahead and land your change, and then let someone with internal access follow up on my issue. I'll add additional details of my findings to the internal bug report.

srawlins commented 6 months ago

@DanTup is that the right next step?

DanTup commented 6 months ago

Yes, I think so. https://dart-review.googlesource.com/c/sdk/+/358902 hasn't landed yet (I think it needs another review - @stereotype441 is listed as a reviewer because he triggered the bots but I don't know if he wants to review or someone else on the analyzer team?).

Once that lands, someone with internal access will need to debug the other issue.

bwilkerson commented 6 months ago

I pinged Paul to see whether he can review the CL. If not, I'll fine someone else. I'm planning on closing this issue after the CL lands because any additional work should probably be tracked separately.

pq commented 6 months ago

Any fresh updates here? Is this still a P1?

bwilkerson commented 5 months ago

Any progress?

DanTup commented 5 months ago

@bwilkerson I'm not sure who the question is for. Above, you mentioned finding someone else to review if Paul isn't able to.

However, I noticed the CL showed a merge conflict today (due to changing some lines around the same place as 017bed7b9adc901fb50990bc83f492ab529db6a3 did to change final->var) so I've rebased to fix that.

bwilkerson commented 5 months ago

Thanks. Working on getting that second review...