Async RMT Tx not working with >64 PulseCodes

akriese commented 3 weeks ago

I was switching my project from manual interrupt handling to using embassy. In the process I also switched from blocking to async RMT. Before, I was sending about 150*24=3000 PulseCodes on each transmission without issues to control an LED strip. With the async implementation, the strip wasnt getting any input apparently. I used parts of the embassy_rmt example to reproduce this and check for which array sizes the RMT would stop working completely. At 65 PulseCodes, my example stopped working, so 64 seems to be the sweet spot.

This makes some sense as the RAM blocks for each RMT channel hold up to 64 data points according to the ESP32 documentation. So, I guess, something which the Blocking implementation is doing right is going wrong on the Async side. I haven't had the time to look into the implementation further. Maybe someone has an insight on this?

Here are some snippets of the program:

#[main]
async fn main(spawner: Spawner) {
    let peripherals = Peripherals::take();
    let system = SystemControl::new(peripherals.SYSTEM);
    let clocks = ClockControl::max(system.clock_control).freeze();

    esp_println::logger::init_logger_from_env();

    let timg0 = TimerGroup::new_async(peripherals.TIMG0, &clocks);
    esp_hal_embassy::init(&clocks, timg0);

    let io = Io::new(peripherals.GPIO, peripherals.IO_MUX);

    let channel = transmit::init_rmt(peripherals.RMT, io.pins.gpio26, &clocks);

    spawner.spawn(test_rmt(channel)).ok();
}

pub fn init_rmt<'d, P: esp_hal::gpio::OutputPin>(
    rmt: impl Peripheral<P = RMT> + 'd,
    pin: impl Peripheral<P = P> + 'd,
    clocks: &Clocks,
) -> Channel<Async, 0> {
    let _rmt = Rmt::new_async(rmt, HertzU32::MHz(80), clocks).unwrap();
    _rmt.channel0
        .configure(
            pin,
            TxChannelConfig {
                clk_divider: 1,
                idle_output_level: false,
                idle_output: false,
                carrier_modulation: false,
                carrier_high: 1,
                carrier_low: 1,
                carrier_level: false,
            },
        )
        .unwrap()
}

#[embassy_executor::task]
async fn test_rmt(mut channel: Channel<Async, 0>) {
    loop {
        let on = 10000 / NS_PER_CLOCK_CYCLE;
        let off = 10000 / NS_PER_CLOCK_CYCLE;
        let mut sequence = [PulseCode {
            level1: true,
            length1: on,
            level2: false,
            length2: off,
        }; 65];

        sequence.last_mut().unwrap().length2 = 0;

        channel.transmit(&sequence).await.ok();

        Timer::after_secs(1).await;
    }
}

akriese commented 3 weeks ago

So, I dug a bit into the rmt.rs code and noticed, that the Blocking implementation calls send_raw() once and then calls the remaining necessary send_raw() calls in SingleShotTxTransaction::wait(). Meanwhile, the Async implementation throws an error (that I seemed to simply ignore earlier) when the data exceeds the RMT RAM size (which is also stated by the short docstring of the transmit() function of the Async impl). Couldn't they both use the same approach or is that somehow not supportable with Async? Sending my big array in async transmission chunks did not solve the problem as there is probably too much time between the calls. This was the code that I tried (but did not work):

    let cycles = sequence.len() / 64;
    for i in 0..cycles {
        channel
            .transmit(sequence[i * 64..(i + 1) * 64])
            .await
            .unwrap();
    }
    channel
        .transmit(sequence[cycles * 64..])
        .await
        .unwrap();

akriese commented 3 weeks ago

Couldn't one do an async wait for the RMT channel being ready to send the next chunk? In the Blocking implementation, this is done by checking the TX-THR register. Surely, there is an interrupt connected to that so that the Async implementation could wait for that and do other stuff in the meantime, right?

brainstorm commented 3 weeks ago

Thanks for filing this! Closely related to my current musings in https://github.com/esp-rs/esp-hal/issues/1710

tschundler commented 3 weeks ago

I've been digging into this as well in #1768. It seems the 64 symbol limit was acknowledged in the original commit in #787

I can send unlimited symbols with fixed memory usage - no need for any buffers - https://github.com/esp-rs/esp-hal/compare/main...tschundler:esp-hal:async-led-iterator

I noticed the async implementation doesn't read any more from the input buffer / iterator. This weekend I'm trying to extend my changes to the async trait also also to the smart LEDs hal.

Trying without interrupts, unless the content is less than 64 symbols, it doesn't start sending until I press the reset button. When using interrupts, the poll seems to only ever be triggered once and not again, so it works, but only if I'm sending something within the size of one buffer. I can't seem to figure out how to correctly reset it.

tschundler commented 3 weeks ago

Spending most of today fussing with this, I can only seem to get it to work without interrupts, just forcing polling by having it call ctx.waker().wake_by_ref() before returning Pending. You can try pulling from my repo and seeing how it works for you.

It seems at least for LED purposes there isn't enough time to refill the buffer with the overhead restarting the task after an interrupt. There should be time for ~7,000 instructions to run. By adding some panics, often it is time to refill the buffer on the first time the poll is called. (I'm building with --release) That is quite a bit of overhead.

Also I noticed when I call in a loop, sometimes the first time has an underrun, while subsequent runs are OK

Perhaps the RMT buffer needs to be refilled in the interrupt and the async routine refills a buffer that is read in the interrupt?

bjoernQ commented 1 week ago

It's a known limitation that the async implementation won't handle sending more pulse codes than the buffer can take: https://github.com/esp-rs/esp-hal/blob/5d6354ccbd62527452968074c4a92a4fc6c6a635/esp-hal/src/rmt.rs#L1173-L1175

https://github.com/esp-rs/esp-hal/blob/5d6354ccbd62527452968074c4a92a4fc6c6a635/esp-hal/src/rmt.rs#L1166-L1168

Refilling the buffer is very time critical - I'm not surprised it doesn't work via wakers. It might work with just an interrupt handler refilling the buffer and waking the waker once it's completely done.

Depending on the number of pulses sent and how often that is done it might be okay to use the blocking API - this will block other tasks from running until all the pulses are clocked out but depending on how time sensitive other tasks are that might be okay

bjoernQ commented 1 week ago

Removed the "bug" label since it's working as documented. However, it's certainly a valid feature request

esp-rs / esp-hal

Async RMT Tx not working with >64 PulseCodes #1779