d-markey / squadron

Multithreading and worker thread pool for Dart / Flutter, to offload CPU-bound and heavy I/O tasks to Isolate or Web Worker threads.
https://pub.dev/packages/squadron
MIT License

Event loop is stuck on await #36

Closed sabin26 closed 3 months ago

sabin26 commented 3 months ago

Hi, I have a simple squadron class.

@SquadronService(baseUrl: '/services/websocket')
class WebSocketService {
  WebSocketService({required this.url}): _client = WebSocketClient(url: url);

  final String url;
  final WebSocketClient _client;

  @SquadronMethod()
  Future<void> connect() async {
    ...
  }

  @SquadronMethod()
  Future<void> disconnect() async {
    ...
  }

  @SquadronMethod()
  Future<void> closeConnection() async {
    print('I am closing connection');
    await _client.close();
    print('I am returning');
  }
}

Everything works great! All thanks to your wonderful package.

I am trying to make sure that the WebSocket client is closed when the mobile app is detached.

appLifecycleListener = AppLifecycleListener(
  onDetach: () async {
    print('I am disposing');
    final wsPool = ...; // This worker pool has at most 1 worker
    await wsPool.closeConnection();
    print('I am successful');
  },
);

Logs:

I am disposing
I am closing connection
I am returning

I have made sure to handle all errors properly, and I do not get any. The app runs and closes perfectly. But the message 'I am successful' is never printed. The event loop is stuck awaiting wsPool.closeConnection(), even though the closeConnection() method has already completed in the worker.

This is a simplification of my issue.

My original problem is that I close the connection and then stop the worker pool. Closing the connection is asynchronous while stopping the pool is synchronous, so without await the pool is stopped first and closing the connection then fails. But I cannot add await, because the event loop then gets stuck as shown above. I tried stopping the worker pool asynchronously, so that closeConnection runs before the pool is stopped, as below:

Future.delayed(Duration.zero, () {
  // Never called if wsPool.closeConnection() is invoked before or after this Future.
  wsPool.stop();
});

Future.delayed(const Duration(seconds: 1), () {
  // Never called if wsPool.closeConnection() is invoked before or after this Future.
  wsPool.stop();
});

Timer(const Duration(seconds: 1), () {
  // Never called if wsPool.closeConnection() is invoked before or after this Timer.
  wsPool.stop();
});

I have spent the whole day fighting this bug, but I still cannot find its root cause.
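The ordering problem described above (an un-awaited asynchronous call followed by a synchronous stop) can be reproduced without Squadron. The following is a minimal sketch under stated assumptions: FakePool is a hypothetical stand-in written for illustration, not Squadron's API, and the zero-duration delay only simulates the round trip to a worker.

```dart
import 'dart:async';

/// Hypothetical stand-in for a worker pool; not Squadron's API.
class FakePool {
  final List<String> log = [];
  bool _stopped = false;

  /// Asynchronous: everything after the first `await` runs in a later
  /// event-loop turn.
  Future<void> closeConnection() async {
    // Simulated round trip to the worker (an assumption for this sketch).
    await Future<void>.delayed(Duration.zero);
    log.add(_stopped ? 'close: worker already stopped' : 'close: ok');
  }

  /// Synchronous: takes effect immediately.
  void stop() {
    _stopped = true;
    log.add('stop');
  }
}

Future<List<String>> runDemo() async {
  final pool = FakePool();
  unawaited(pool.closeConnection()); // started, but not awaited
  pool.stop(); // runs before closeConnection() resumes
  // Give the pending timer a chance to fire before reading the log.
  await Future<void>.delayed(const Duration(milliseconds: 10));
  return pool.log;
}

Future<void> main() async {
  print(await runDemo()); // [stop, close: worker already stopped]
}
```

This only shows why the un-awaited variant stops the pool first. It does not explain why the awaited call never completes in onDetach.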

sabin26 commented 3 months ago

I thought that maybe, once the app is detached, the main thread goes away and can no longer receive messages from the worker thread. But wsPool.closeConnection() can still be called from the main thread, which makes this unlikely.

sabin26 commented 3 months ago

I made sure to cancel all pending tasks by calling wsPool.cancel() before closing the connection. Assuming wsPool.cancel() does what I think it does, it should cancel all active tasks, leaving the worker with nothing to do but close the connection. So the event loop shouldn't really hang :(

sabin26 commented 3 months ago

I am able to call other async functions and await them inside the onDetach method of AppLifecycleListener, but I cannot get past await wsPool.closeConnection().

sabin26 commented 3 months ago

It's not specific to wsPool.closeConnection(), as I am unable to await other Squadron methods too. They are invoked, but the await hangs.

sabin26 commented 3 months ago

One workaround I tried is closing the connection without awaiting it and without stopping the worker pool. Since the worker pool has a minimum concurrency of 1 worker and a maximum of 1 worker, I thought it would stop automatically. Judging by the call stack, which is empty once the app is detached, it does appear to stop.

But once the app is opened again after being detached, the last worker, which should have been removed, appears on the stack again, meaning it got restored somehow. Here's the call stack after reopening the app:

[Screenshot: websocket_close_bug_squadron]

sabin26 commented 3 months ago

Update:

  1. Using a worker instead of a worker pool works as expected, with no need for workarounds.
  2. If using a worker pool, here is the workaround:
wsPool.closeConnection(); // don't use await here
await wsPool.scheduleTask((final worker) async { wsPool.stop(); }); // This works as expected.

The original issue was that the code below didn't work as expected:

wsPool.closeConnection(); // Cannot close connection because the worker is already stopped.
wsPool.stop(); // This runs first as this is synchronous.
// OR,
await wsPool.closeConnection(); // This invokes the method in the worker but doesn't complete in main thread.
wsPool.stop(); // This is never called because the event loop is stuck on above line.

Closing this, since it may well be an application-level bug. Kindly re-open and investigate if you think this might be a bug in Squadron :)

d-markey commented 2 months ago

Hello,

thanks for your detailed feedback and progress on solving this issue, and sorry for the late reply! Indeed, using a single worker instead of a worker pool with a 0:1 or 1:1 concurrency is the recommended approach.

With the worker pool and a min concurrency > 0, I'm actually not sure how cancel() and stop() behave.

Please note that contrary to workers, which will reject further work requests after they've been stopped, pools will accept new work requests after stop() was called and will spawn new workers if necessary, following any supplied concurrency settings.
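To illustrate the difference described above, here is a toy model of the two behaviours. It is only a sketch: ToyWorker and ToyPool are hypothetical classes written for this example and do not reflect how Squadron is implemented. They only mimic the stated contract, where a stopped worker rejects further work while a stopped pool spawns a fresh worker on the next request.

```dart
/// Toy model only; names and behaviour are illustrative, not Squadron's code.
class ToyWorker {
  bool _stopped = false;

  void stop() => _stopped = true;

  /// A stopped worker rejects any further work request.
  String run(String task) {
    if (_stopped) throw StateError('worker is stopped');
    return 'done: $task';
  }
}

class ToyPool {
  ToyWorker? _worker;
  int spawned = 0;

  /// Stops and discards the current worker, but the pool itself stays usable.
  void stop() {
    _worker?.stop();
    _worker = null;
  }

  /// Spawns a new worker on demand, even after stop() was called.
  String run(String task) {
    final worker = _worker ??= _spawn();
    return worker.run(task);
  }

  ToyWorker _spawn() {
    spawned++;
    return ToyWorker();
  }
}

void main() {
  final worker = ToyWorker()..stop();
  try {
    worker.run('close');
  } on StateError catch (e) {
    print('worker: ${e.message}'); // worker: worker is stopped
  }

  final pool = ToyPool()..run('connect');
  pool.stop();
  print(pool.run('close')); // done: close
  print('workers spawned: ${pool.spawned}'); // workers spawned: 2
}
```

Under this model, any request sent to a pool after stop() brings a worker back, which would be consistent with the worker reappearing on the call stack after the app is reopened.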