dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.08k stars 1.56k forks source link

Issues with Socket and SecureSocket error handling and documentation #55978

Open larssn opened 3 months ago

larssn commented 3 months ago

Hello,

I'm finding the fact that Socket and SecureSocket are based on Stream, error-prone, and hard to work with. I've included an example below that shows two different errors.

Problem Description

Inspecting the documentation for socket.write, it states:

This operation is non-blocking. See [flush] or [done] for how to get any errors generated by this call.

The documentation for flush states:

... This method must not be called while an [addStream] is incomplete. NOTE: This is not necessarily the same as the data being flushed by the operating system.

There are several issues here:

  1. It's unclear if any of the related API calls use addStream under the hood, making it hard to know when it's safe to call flush.
  2. The second part of the flush documentation is vague. Does it mean that flush might not always work depending on the OS? It's not clear.

Example Code

Before running the example, uncomment Error block 1 or 2.

import 'dart:async';
import 'dart:io';

void main() async {
  final addr = InternetAddress('0.0.0.0', type: InternetAddressType.IPv4);

  final server = MyServer(addr);
  await server.create();
  server.listen();

  final client = MyClient(addr);
  await client.connect();
  print('Done');
}

class MyServer {
  MyServer(this._addr);

  late final ServerSocket _server;
  final InternetAddress _addr;

  Future<void> create() async {
    _server = await ServerSocket.bind(_addr, 7443);
    print('Server started');
  }

  /// Starts the server.
  Future<void> listen() async {
    await for (final socket in _server) {
      print('Incoming request');

      socket.listen((data) async {
        try {
          print('Sending response');

          // !! UNCOMMENT ONE OF THE BLOCKS BELOW !!

          // Error 1: Bad state: StreamSink is bound to a stream
          // Crashes at `socket.write('Goodbye');`, which is never sent to the client.
          // socket.write('Hello');
          // socket.flush();
          // socket.write('Goodbye');

          // Error 2: Bad state: StreamSink is closed
          // Crashes at `scheduleMicrotask(() => socket.write('Goodbye'));`,
          // which is never sent to the client.
          // scheduleMicrotask(() => socket.write('Hello'));
          // await socket.flush();
          // scheduleMicrotask(() => socket.write('Goodbye'));

          await socket.close();

          print('Client request completed, response sent');
        } catch (err, stackTrace) {
          print(err);
          print(stackTrace);
        }
      }, onDone: () {
        print('Socket done');
      });
    }

    print('Server stopped');
  }
}

class MyClient {
  MyClient(this._addr);

  final InternetAddress _addr;

  Future<void> connect() async {
    final s =
        await Socket.connect(_addr, 7443, timeout: const Duration(seconds: 6));
    print('[C] Connected to server');

    s.write('Start');

    await for (final data in s) {
      print(String.fromCharCodes(data));
    }
  }
}

While the above example is somewhat contrived, it illustrates how little it takes for errors to occur seemingly unrelated to the current operation. Writing to a socket should not leak internal stream-related errors. Is write using addStream under the hood? It's not clear since it's all external code.

Additionally, nowhere is it mentioned that write can produce errors on its own.

This makes it very hard to work with. I've encountered production code crashing at write or flush, seemingly out of nowhere (maybe because addStream was incomplete?), making both methods unreliable.

Request

Can anything be done to improve this? Specifically:

  1. Documentation Improvements:

    • Clarify if/when addStream is used internally by methods like write and flush.
    • Provide detailed explanations and examples on handling errors related to these methods.
    • Explain the impact of OS-level differences on flush behavior.
  2. API Enhancements:

    • Improve the Socket and SecureSocket APIs to handle internal state changes more gracefully, avoiding unexpected errors.
    • Consider adding safe guards or built-in retry mechanisms for write and flush operations.
  3. Practical Examples:

    • Provide code examples demonstrating safe and reliable ways to use write and flush in various scenarios.
    • Suggest patterns or best practices for dealing with these errors in production code.

Would rather I wouldn't have to do something like:

Future<void> _tryAgain(Socket s, String mes) async {
  try {
    s.write(mes);
    return;
  } catch (_) {
    await Future.delayed(const Duration(milliseconds: 10));
    return _tryAgain(s, mes);
  }
}

General info

OS

macOS Sonoma 14.5 (23F79)

Thank you for your attention.

lrhn commented 3 months ago

In general, it seems you are not allowed to call write while a flush is in progress.

Here you are not awaiting the flush, so the following write happens while the flust is in progress:

          // socket.write('Hello');
          // socket.flush();
          // socket.write('Goodbye');

Here you are scheduling a write to happen later, then calling flush, which means the former write happens while the flust is in progress.

          // scheduleMicrotask(() => socket.write('Hello'));
          // await socket.flush();
          // scheduleMicrotask(() => socket.write('Goodbye'));

So at least it's consistent. I agree that it's not documented behavior.

larssn commented 3 months ago

It's deliberate that I'm not awaiting flush. If multiple chunks of async code is using the same socket, awaiting a flush isn't always possible. Sure you can await it, but code in a different location might try to write at the same instance.

And these two examples are just the ones I was able to conjure up, I've seen more weird crashes in our production environment.

Basically it feels very unsafe, to call either write or flush.