raspberrypi / pico-sdk

BSD 3-Clause "New" or "Revised" License
3.3k stars 853 forks source link

Support stdio drivers with blocking character input #1020

Open alastairpatrick opened 1 year ago

alastairpatrick commented 1 year ago

The signature of stdio_driver_t::in_chars is:

int (*in_chars)(char *buf, int len)

I would like to add another entry point to stdio_driver_t:

int (*in_chars_timeout_us)(char *buf, int len, uint64_t until)

The reason is, the former makes it awkward to implement a blocking stdio driver for an RTOS. Here is what I would like to write for qOS:

int in_chars_timeout_us(char *buf, int len, uint64_t until) {
  int i = 0;
  for (; i < len; ++i) {
    while (uart0_hw->fr & UART_UARTFR_RXFE_BITS) {
      if (!qos_await_irq(UART0_IRQ, &uart0_hw->imsc, UART_UARTIMSC_RXIM_BITS | UART_UARTIMSC_RTIM_BITS, until)) {
        goto timeout;
      }
    }
    buf[i] = uart0_hw->dr;
  }

timeout:
  return i ? i : PICO_ERROR_NO_DATA;
}

Without the added until parameter, the options are to have qos_await_irq() not block, in which case stdio input busy waits or to have qos_await_irq() timeout after an arbitrary amount of time, such as one second, which reduces the precision of the timeout passed to getchar_timeout_us().

alastairpatrick commented 1 year ago

I have an alternative suggestion, which I think is more satisfying. Instead of adding in_chars_timeout_us, add this new entry point:

void (*await_char_readable)(absolute_time_t until);

The contract is, it must return if/when a character is readable, it should return after timeout and it may return spuriously. An RTOS might implement it like this:

void await_char_readable(absolute_time_t until) {
  qos_await_irq(UART0_IRQ, &uart0_hw->imsc, UART_UARTIMSC_RXIM_BITS | UART_UARTIMSC_RTIM_BITS, until);
}

A busy waiting driver might implement it like this:

void await_char_readable(absolute_time_t until) {
  // we sleep here in case the in_chars methods acquire mutexes or disable IRQs and
  // potentially starve out what they are waiting on (have seen this with USB)
  busy_wait_us(1);
}

And this is the default implementation if the entry point is null. I've moved the busy_wait_us call out of stdio_get_until() into await_char_readable() because it is unnecessary and undesirable to busy wait if there's an RTOS.

This also solves the problem of what to do if there are >1 drivers; call await_char_readable() only if there is one driver and fall back on busy waiting if there are >1. I would rather do something better for the >1 case though.

alastairpatrick commented 1 year ago

An idea for making the >1 driver case better... If there is only one driver and it implements await_char_readable() then do things same as above. If there are >1 drivers or if the only driver does not implement await_char_readable(), then have stdio_get_until() call a macro pico_yield() in place of busy_wait_us(1). Something like:

static int stdio_get_until(char *buf, int len, absolute_time_t until) {
    int num_drivers = 0;
    bool should_yield = false;
    for (stdio_driver_t *driver = drivers; driver; driver = driver->next) {
        if (filter && filter != driver) continue;
        if (!driver->in_chars) continue;
        if (!driver->await_char_readable)
            should_yield = true;
        ++num_drivers;
    }
    if (num_drivers > 1)
        should_yield = true;

    do {
        for (stdio_driver_t *driver = drivers; driver; driver = driver->next) {
            if (filter && filter != driver) continue;

            if (!should_yield)
                driver->await_char_readable(until);

            if (driver->in_chars) {
                int read = driver->in_chars(buf, len);
                if (read > 0)
                    return read;
            }
        }
        if (time_reached(until))
            return PICO_ERROR_TIMEOUT;

        if (should_yield)
            pico_yield();
    } while (true);
}

By default, pico_yield() just calls busy_wait_us(1) as before:

#define pico_yield() busy_wait_us(1)

But, similar to the lock_internal macros, an RTOS may define it to something appropriate to prevent lower priority tasks from starving:

#define pico_yield() qos_yield()

pico_yield() could potentially find other uses in the SDK.

With these things done, if there is only one driver, an RTOS could wait for stdio input while in sleep mode. With >1 driver, it wouldn't be able to sleep while waiting for input but at least lower priority tasks wouldn't starve.

alastairpatrick commented 1 year ago

This is closely related to issue #547, which suggests this is also a problem for FreeRTOS.

alastairpatrick commented 1 year ago

Two desirable outcomes that I haven't considered so far are: 1) It would be good if vanilla SDK without an RTOS could sleep while waiting for stdio input. 2) It would be good if an RTOS didn't have to implement a whole new stdio driver to make blocking work.

A blocking stdio UART driver for the SDK might look like this:

#define stdio_yield(delay) busy_wait_us(delay)
#define stdio_wait(until) best_effort_wfe_or_timeout(until)
#define stdio_signal() __sev();
#define stdio_signal_from_isr() __sev();

// stdio.c
static int stdio_get_until(char *buf, int len, absolute_time_t until) {
    do {
        bool can_block = __get_current_exception() == 0;
        for (stdio_driver_t *driver = drivers; driver; driver = driver->next) {
            if (filter && filter != driver) continue;
            if (driver->in_chars) {
                int read = driver->in_chars(buf, len);
                if (read > 0) {
                    return read;
                }
            }
            if (!driver->prepare_to_block) {
              can_block = false;
            }
        }
        if (time_reached(until)) {
            return PICO_ERROR_TIMEOUT;
        }
        if (can_block) {
          for (stdio_driver_t *driver = drivers; driver; driver = driver->next) {
              if (filter && filter != driver) continue;
              driver->prepare_to_block();
          }
          stdio_wait(until);
        } else {
          stdio_yield(1);
        }
    } while (true);
}

// stdio_uart.c
static void uart_prepare_to_block() {
  hw_set_bits(&g_uart_hw->imsc, UART_UARTIMSC_RXIM_BITS | UART_UARTIMSC_RTIM_BITS);
}

static void shared_uart_interrupt_handler() {
  hw_clear_bits(&g_uart_hw->imsc, UART_UARTIMSC_RXIM_BITS | UART_UARTIMSC_RTIM_BITS);
  stdio_signal_from_isr();
}

This must not be a breaking change, so add a function to opt-in to blocking mode. Caller agrees that it ought not to install an exclusive UART IRQ handler on this core or any UART IRQ handler on the other core or mess with the UART IRQ NVIC registers.

void stdio_uart_init_blocking(int irq_priority) {
  irq_add_shared_handler(g_uart_irq, shared_uart_interrupt_handler);
  irq_set_priority(g_uart_irq, irq_priority);
  irq_set_enabled(g_uart_irq, true);

  // Was null before to indicate to stdio_get_until() that blocking wasn't supported.
  g_driver.prepare_to_block = uart_prepare_to_block;
}

An RTOS replaces these macros. All stdio drivers share the same the same event synchronization object to allow blocking even when there are several drivers.

extern qos_event_t g_stdio_event;
#define stdio_yield(delay) qos_yield()
#define stdio_wait(until) qos_await_event(&g_stdio_event, qos_from_absolute_time(until));
#define stdio_signal() qos_signal_event(&g_stdio_event);
#define stdio_signal_from_isr() qos_signal_event_from_isr(&g_stdio_event);

FreeRTOS might do something similar with xEventGroupSetBits(), xEventGroupSetBitsFromISR() and xEventGroupWaitBits();

Then with changes along these lines working, hopefully: 1) When an RTOS is in use, tasks don't starve while a higher priority task awaits stdio input. 2) If all drivers support blocking, stdio_get_until() can sleep while waiting for stdio input, regardless of whether an RTOS is in use. 3) RTOSes can reuse the SDK's UART driver rather than implement their own. Not sure about the USB one yet.

alastairpatrick commented 1 year ago

Here's a prototype of above scheme that seems to kind of work: https://github.com/alastairpatrick/pico-sdk/commit/8a5e920ca943385734d3227648c44c4f2f12f494. I also made stdio_uart_out_chars() potentially blocking since RTOSes might want to run lower priority tasks while the UART TX FIFO empties.