dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.2k stars 1.57k forks source link

[vm_service] VmServerConnection does not handle SentinelExceptions as prescribed by the vm service interface #51752

Closed annagrin closed 1 year ago

annagrin commented 1 year ago

VmServerConnection seems to convert SentinelException to an RPC error. This has caused flakes in flutter web tests:

Parent issue: https://github.com/flutter/flutter/issues/121708

The issue above currently has a PR that resolves it for one case, but we should have a reliable solution that works for all vm service API and their clients, to prevent more hard-to-detect failures due to inconsistencies in checks.

Details

dwds' chrome proxy service throws a sentinel exception on getIsolate for non-existing isolate id, as required by the VmServiceInterface:

https://github.com/dart-lang/sdk/blob/67f3e1f4a0dfe57c07192607095892a9d107a9d1/pkg/vm_service/lib/src/vm_service.dart#L690

but the VmServerConnection translates it into an RPC error with RPCError.kInternalError as code, and the sentinel is added to the message as a string:

https://github.com/dart-lang/sdk/blob/67f3e1f4a0dfe57c07192607095892a9d107a9d1/pkg/vm_service/lib/src/vm_service.dart#L1795

The clients then need to check for the string contents to contain a sentinel kind. See https://github.com/flutter/flutter/pull/122776#discussion_r1138890204 for a client check example.

Should we change the VmServerConnection to throw something different, or have a set of error codes in package:vm_service we all agree on in servers and clients can check for in their code?

annagrin commented 1 year ago

/cc @bkonyi @derekxu16

derekxu16 commented 1 year ago

Here's my analysis of the problem:

flutter_tools passes the VmServerConnection URI that DWDS creates to createVmServiceDelegate here https://github.com/flutter/flutter/blob/23f5582ea654d01f9462f04158cf2c2a188708ec/packages/flutter_tools/lib/src/isolated/devfs_web.dart#L697

createVmServiceDelegate uses the URI to create a VmService object here https://github.com/flutter/flutter/blob/23f5582ea654d01f9462f04158cf2c2a188708ec/packages/flutter_tools/lib/src/vmservice.dart#L383-L390

VmService. _processResponse always interprets an error in the json reponse map as an RPCError https://github.com/dart-lang/sdk/blob/4faf18deafde27d2bc6c10cfec12b0ba85afaac5/pkg/vm_service/lib/src/vm_service.dart#L2444-L2445

I'm don't have a clear understanding of how VmServerConnection was intended to be used yet. Do we expect VmServerConnection to be used to create a VmService object like createVmServiceDelegate does? If so, then I guess we can check for SentinelException here https://github.com/dart-lang/sdk/blob/4faf18deafde27d2bc6c10cfec12b0ba85afaac5/pkg/vm_service/lib/src/vm_service.dart#L1795

and store SentinelExceptions in a way such that they can be detected and reconstructed here https://github.com/dart-lang/sdk/blob/4faf18deafde27d2bc6c10cfec12b0ba85afaac5/pkg/vm_service/lib/src/vm_service.dart#L2444-L2445

But maybe VmServerConnection is not intended to be used to create a VmService object, in which case we need to design a different solution.

bkonyi commented 1 year ago

A VmServerConnection is effectively a wrapper around the streams (e.g., a web socket in most cases) associated with a single VM service connection that handles routing of VM service requests in a VM service implementation (in this case, DWDS, which implements VmServiceInterface internally).

The issue here is that when the client is invoking getIsolate on an isolate that no longer exists, the VmServerConnection first parses the request, determines it needs to invoke getIsolate(...) and then calls the method. getIsolate(...) will throw a SentinelException which is caught and then converted to a kInternalError here, which isn't correct. We should instead be returning a Sentinel response which takes the following form:

{
  "jsonrpc": "2.0",
  "id": 123,
  "data": {
    "type": "Sentinel"
    "kind": "Collected",
    "valueAsString": "Sentinel <collected>"
  }
}
annagrin commented 1 year ago

@bkonyi I think I tried that in my experiments while fixing the flutter test flake, but a if i remember correctly it does not result in throwing exception from the VmServerConnection.getIsolate as the clients expect, instead it seems to just return a sentinel value. I tried throwing a sentinel exception as well but run into some problems converting it to json.

derekxu16 commented 1 year ago

Looks like returning a sentinel value as @bkonyi suggested does fix the problem, because VmService will convert it to a SentinelException here https://github.com/dart-lang/sdk/blob/4faf18deafde27d2bc6c10cfec12b0ba85afaac5/pkg/vm_service/lib/src/vm_service.dart#L2450

bkonyi commented 1 year ago

SentinelException isn't actually defined in the protocol and is something we added to package:vm_service to avoid needing to return dynamic from the RPCs that can get a Sentinel response. As @derekxu16 has mentioned, we explicitly check for a Sentinel response type in package:vm_service which we then convert to a SentinelException that is thrown.

derekxu16 commented 1 year ago

Fixed in package:vm_service v11.2.1

https://pub.dev/packages/vm_service/versions/11.2.1