foxsi / foxsi-4matter

Code for FOXSI-4 formatter.
https://foxsi.umn.edu/

Give `boost::asio::io_context` work to do on a timer #31

Closed KriSun95 closed 10 months ago

KriSun95 commented 1 year ago

I think this task is more important and more difficult than the others in this list.

Boost’s library boost::asio has powerful tools for asynchronous input-output operations. Asynchronous means a method does not block—it runs in the background and returns immediately so execution can continue. This is very handy for the Formatter—you can always listen in the background for uplink commands. When you get one, a function gets called to deal with it. This paradigm is in contrast to “busy-waiting”—running an infinite loop that checks for fresh data each cycle.

The recipe for setting up the io_context is simple. Look at the parts of foxsi-4matter/examples/commanding/commanding_example.cpp that involve boost::asio::io_context for reference. This is the basic idea:

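#include <boost/asio.hpp>
#include <iostream>

// placeholder async wrapper: it needs access to the io_context (or to a socket built from it)
void my_async_function(boost::asio::io_context& context) {
    boost::asio::post(context, []() { std::cout << "doing async work\n"; });
}

int main() {
    boost::asio::io_context context;
    // some socket definitions that are instantiated using `context`
    my_async_function(context);    // this only queues work on `context`; nothing runs yet
    context.run();  // this starts doing the work described by `my_async_function()`
}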

If you look at a lot of the methods of TransportLayerMachine, you will see asynchronous function handlers (for example, TransportLayerMachine::recv_udp_fwd_tcp()). The general design is to asynchronously read data from a socket into a buffer, then bind (boost::bind()) a function to be called after an asynchronous read operation takes place. Then, when data is sent to the socket (from, for example, the SPMU-001), it gets stored in the buffer, then the callback function is executed.

Currently, anytime the GSE sends a command to the Formatter, an asynchronous method (TransportLayerMachine::recv_udp_fwd_tcp_cmd(), then TransportLayerMachine::handle_cmd()) immediately executes and deals with the command. For flight though, we want to add uplink commands to a queue, then act on them when we swing around to the appropriate system in the loop. We will loop through systems like this, and each step in the loop will run through these steps.
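One way the queue-then-act idea could look, as a sketch only: the System enum, the UplinkCommand struct, and the CommandQueues class are all hypothetical and do not exist in the repo. The receive callback would call enqueue() instead of acting on the command, and the main loop would call drain() when it reaches each system.

```cpp
#include <cstdint>
#include <map>
#include <queue>
#include <vector>

// Hypothetical system identifiers (the real list lives in the Formatter code).
enum class System { CDTE1, CMOS1, HOUSEKEEPING };

// Hypothetical uplink command: which system it is for, plus an opaque code.
struct UplinkCommand {
    System target;
    std::uint8_t code;
};

class CommandQueues {
public:
    // Called from the asynchronous receive callback: store the command, do not act on it.
    void enqueue(const UplinkCommand& command) {
        pending_[command.target].push(command);
    }

    // Called when the loop swings around to `system`: hand back everything
    // queued for it, in arrival order, and leave that queue empty.
    std::vector<UplinkCommand> drain(System system) {
        std::vector<UplinkCommand> ready;
        auto& queue = pending_[system];
        while (!queue.empty()) {
            ready.push_back(queue.front());
            queue.pop();
        }
        return ready;
    }

private:
    std::map<System, std::queue<UplinkCommand>> pending_;
};
```

This assumes enqueue() and drain() are called from the same thread (for example, both from handlers run by a single io_context::run() call). If they can run concurrently, the queues would need a mutex or a strand.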

To enable this, there are two things you can do (in increasing order of complexity):

The test implementation

You can make a new main file that creates a boost::asio::io_context and defines a worker function. The worker can just be an infinite loop with a print statement and a delay inside or something. The idea is it just stands in for background stuff happening.

Then, inside main, try to use boost::asio::post(io_context, your_function) to add your worker function to the io_context as work. Call io_context.run() to do the work that has been posted to it.

Then try instead using io_context.run_until(time) to set a deadline. How is the exit handled? Are there issues in the worker function when it is killed? Is it possible to add a second task to io_context after it starts running? What is the timing cost of stopping and starting io_context?

The more integrated implementation

In foxsi-4matter/apps/main.cpp there is a working example of the Metronome class. I want to use this to manage timing of input/output activity for each onboard system. E.g., I receive data for a certain amount of time from CdTe detector 1, then cut it off and move on to talking to CMOS 2. You can combine the Metronome functionality with some new io_context run/stop/post operations to demonstrate this. Note that, per CMakeLists.txt, foxsi-4matter/apps/main.cpp builds to a target binary named formatter. Once built, you can run it from the main foxsi-4matter folder like this (it takes no additional arguments yet):

./bin/formatter

The Metronome is instantiated using a total period duration, a lookup table for outer loop and inner loop periods by system, and a reference to an io_context. The method Metronome::tick() gets executed every time its timer ticks, with the tick times based on the lookup table you supply. Can you modify the Metronome::tick() method to start and stop your worker function activity appropriately on each ::tick() call?

If it’s helpful, in Parameters.h I define two enum classes: SUBSYSTEM_ORDER and STATE_ORDER. You can increment through each of these to loop through the states (things like send command, request data, receive data, idle) that will be done for each onboard system, and to loop through the onboard systems. These two types are also the keys to a dictionary-type object that stores the durations of each part of the loop. The loop is like this in pseudocode:

for every system in {cdte1-4, cmos1-2, timepix, hk}
    for every state in {cmd, request, receive, idle}
        do the state- and system-dependent work.
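The pseudocode above could be sketched in C++ roughly as follows. The enum class names come from Parameters.h, but everything else here is an assumption: the enumerator names, the trailing COUNT sentinels, the pair-keyed std::map used as the duration table, and the next() and walk_loop() helpers. Check the real definitions before borrowing any of it.

```cpp
#include <chrono>
#include <map>
#include <utility>

// Enumerator names and the COUNT sentinels are assumed for illustration;
// the actual definitions are in Parameters.h.
enum class SUBSYSTEM_ORDER { CDTE1, CDTE2, CDTE3, CDTE4, CMOS1, CMOS2, TIMEPIX, HOUSEKEEPING, COUNT };
enum class STATE_ORDER { CMD_SEND, DATA_REQ, DATA_RECV, IDLE, COUNT };

using Duration = std::chrono::milliseconds;
// Assumed shape of the "dictionary-type object": (system, state) -> duration.
using DurationTable = std::map<std::pair<SUBSYSTEM_ORDER, STATE_ORDER>, Duration>;

// Step an enum class to its next enumerator (valid up to the COUNT sentinel).
template <typename Enum>
Enum next(Enum value) {
    return static_cast<Enum>(static_cast<int>(value) + 1);
}

// Visit every (system, state) pair once, in order, and return the total
// time budget of one full pass through the loop.
Duration walk_loop(const DurationTable& durations) {
    Duration total{0};
    for (auto system = SUBSYSTEM_ORDER::CDTE1; system != SUBSYSTEM_ORDER::COUNT; system = next(system)) {
        for (auto state = STATE_ORDER::CMD_SEND; state != STATE_ORDER::COUNT; state = next(state)) {
            auto entry = durations.find(std::make_pair(system, state));
            if (entry != durations.end()) {
                // The state- and system-dependent work would go here,
                // limited to entry->second of wall-clock time.
                total += entry->second;
            }
        }
    }
    return total;
}
```

Pairs missing from the table are skipped and contribute no time, which may or may not be the behavior you want for flight.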

From Thanasi's Google Doc

thanasipantazides commented 1 year ago

Boost's asio examples include ways to cancel running TCP and UDP exchanges. These involve two (or more) threads under a boost::asio::io_context:

  1. one to do the actual transmit/receive work with a socket,
  2. one to run a timer and check if it expires. If it does, close the offending socket by calling close() on the boost::asio::ip::tcp::socket.

For the Formatter, we would want to quickly try to reopen the socket if it closes, since we have one socket connection to all onboard systems. We should tally how many communication timeouts we have for each detector system.

thanasipantazides commented 10 months ago

This is effectively addressed by df4bd41. Closing.