dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.1k stars 1.56k forks source link

Exceptions thrown by network stack (websocket/http/socket) frequently lack stack traces #44994

Open jonahwilliams opened 3 years ago

jonahwilliams commented 3 years ago

The dart websocket/http/socket libraries frequently throw errors with no meaningful stack traces attached. This makes it difficult to identify parts of the flutter_tool code that need additional error handling. Generally what I'd like is for errors thrown during the course of a particular request to contain a stack trace that was created where the user code initiated the request.

This can easily happen if a Completer is completed with an error but no stack trace, though there are other constructs that have a similar effect:

A related example from the socket_impl: the following code does not return a meaningful stack trace, printing an empty string for the stack trace.

import 'dart:io';

Future<void> main() async {
  try {
    await foo();
  } catch (err, st) {
    print('err: $err, st: $st');
   }
}

Future<void> foo() async {
  await WebSocket.connect('ws://localhost:8181/ws'); // Nothing actually running here
}

Some of the other errors are harder to hit, but hopefully the SDK has an existing http test suite that could be exercised here.

zanderso commented 3 years ago

/cc @aam

aam commented 3 years ago

One challenge with providing good stack traces for io api calls is async nature of the io request handling. At the point when we are ready to report an error and create an exception https://github.com/dart-lang/sdk/blob/master/sdk/lib/_internal/vm/bin/socket_patch.dart#L1293 we are running off a ReceivePort listener, which has no access to stack where original io request was made:

OS Error: Connection refused, errno = 61, , #0      _NativeSocket.createError (dart:io-patch/socket_patch.dart:1421:55)
#1      _NativeSocket.reportError (dart:io-patch/socket_patch.dart:1446:9)
#2      _NativeSocket.multiplex (dart:io-patch/socket_patch.dart:1293:13)
#3      _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:184:12)

@rmacnak-google @cskau-g for thoughts on somehow recovering async stack trace

zanderso commented 3 years ago

One approach is to capture a useful backtrace in the synchronous context: https://github.com/dart-lang/sdk/commit/d4d3e3698c8bf2ede46b52438874890564f12113. This is often most useful to the caller since it shows what the application did that led to the exception.

aam commented 3 years ago

Yeah, but capturing stack traces is anticipation of potential errors seems rather expensive .

zanderso commented 3 years ago

@aam Does golem have any benchmarks of socket IO that would give an idea of the cost?

aam commented 3 years ago

Yes there are some io http/socket benchmarks available. I guess If we are to make modest steps like fixing stack trace for the (already quite expensive) connect(https://dart-review.googlesource.com/c/sdk/+/202802), the impact of capturing stack trace might be unnoticeable. I will run this against benchmark suite.

ghost commented 3 years ago

+1 to what Alex is saying; benchmarks for --lazy-async-stacks showed an upwards 90% reduction in runtime for async calls, so the price for eagerly collecting stack traces for every call is high.

zanderso commented 3 years ago

@cskau-g The end goal is having stack traces attached to these exceptions that we can tie back to a line of application code. If you can take a look at @aam's patch and suggest a better way to do that, that would be helpful. Thanks!

a-siva commented 3 years ago

I guess If we are to make modest steps like fixing stack trace for the (already quite expensive) connect(https://dart-review.googlesource.com/c/sdk/+/202802), the impact of capturing stack trace might be unnoticeable. I will run this against benchmark suite.

What was the result of running this patch against the benchmark suite ?

aam commented 3 years ago

What was the result of running this patch against the benchmark suite ?

I didn't notice any impact (you can search for aam-connect-stacktrace-1 on the suite for the results).

aam commented 3 years ago

What was the result of running this patch against the benchmark suite ?

I didn't notice any impact (you can search for aam-connect-stacktrace-1 on the suite for the results).

After the change landed automated email notified of the regression in two benchmarks

HelloWorldHttpServer | -9.6 % | IOPerSecond
HelloWorldHttpServer | -9.83 % | IOPerSecond

I haven't look at how that benchmark is setup, why stack trace collection makes such a big contribution to its IOPerSecond.

christopherfujino commented 1 year ago

@aam are there any blockers to this one? We're still seeing this in crash reporting.

andrewkolos commented 1 month ago

Reading this thread, it's unclear whether there might be a way to preserve stack traces without a significant performance penalty. Judging from https://github.com/dart-lang/sdk/issues/44994#issuecomment-858807944, it appears there would be a significant penalty, but it seems the reason(s) for that penalty are not well understood. This makes me wonder if there is still hope for us getting these stack traces. Please let me know if I have misunderstood.

I bring this up because https://github.com/flutter/flutter/issues/153298 has come to represent 11.76% of flutter tool crashes (and this figure will likely double once the fix for the top crasher, https://github.com/flutter/flutter/pull/153294, lands in stable). Getting stack traces here would be a huge help to troubleshooting these crashes.

cc @aam @a-siva

a-siva commented 1 month ago

The specific change referenced above https://dart-review.googlesource.com/c/sdk/+/202802 was merged a while back.

Are you referring to other io calls that require similar stacktraces?

andrewkolos commented 1 month ago

The specific change referenced above https://dart-review.googlesource.com/c/sdk/+/202802 was merged a while back.

Are you referring to other io calls that require similar stacktraces?

Ah, I failed to realize this was an umbrella issue. My apologies. Yes, I am referring to another io call. I've filed https://github.com/dart-lang/sdk/issues/56474 with more details.