project-improv / improv

Adaptive Platform for Neuroscience Experiments
https://project-improv.github.io
MIT License
38 stars 12 forks source link

Evaluate removing asyncio from Nexus (but not improv as a whole) #191

Open rwschonberg opened 2 months ago

rwschonberg commented 2 months ago

Since ZMQ provides the abstraction of an asynchronous message queue with the interface of a socket, moving to ZMQ gives us the ability to use synchronous functions to perform asynchronous I/O without the need for asyncio. As a result, we should evaluate whether the complexity introduced by having a dependency on asyncio in Nexus gives us the benefits we're looking for.

Previously, we used asyncio to:

  1. poll inbound messages from actors and the TUI.
    1. Both of these types of task have a few things in common:
      1. They are not particularly performance-critical. That is, a delay of a few milliseconds in nexus processing a "setup" command from the UI should be acceptable (indeed, because task processing is almost entirely synchronous, the possibility for this delay is already present in Nexus)
      2. They move a small amount of data (from the TUI, a string of just a single word as a command)
      3. We do not expect Nexus to be handling many messages - its event loop exists to register actor state at startup and process commands from the UI.
      4. Most of their processing time is not actually spent on I/O, but in the compute we perform to process the small chunk of received data.

A few observations:

  1. 1.i.b. and 1.i.d. above imply that tasks as currently employed by Nexus are likely not I/O-bound, which is one of the main uses for asyncio according to the documentation
  2. Processing of tasks is predominantly synchronous within the asyncio tasks. There is only one call to asyncio.wait - almost all of the rest of the await statements are the first (e.g. to recv_multipart on a zmq socket), and only, non-synchronous part of processing an event.
    1. The processing flow in the event loop can be summarized as "wait for a message -> process the message synchronously without yielding the loop to any other task")
  3. The new zmq design collates all inbound update messages from actors to a single REQ/REP socket in nexus. Thus, there are only two tasks that are polled in the event loop.
  4. Tear-down of nexus can be accomplished in two ways - one is entered synchronously, and one is entered asynchronously. As a result, there are some rough edges in trying to handle shutdown in both paths together with appropriate code reuse.

Given the above observations, removal of asyncio from nexus would not introduce any adverse side effects, but would have the net benefit of simplifying the codebase with future gains in maintainability.

rwschonberg commented 2 months ago

cc @jmxpearson and @draelos!