micropython / micropython

MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
https://micropython.org
Other
18.72k stars 7.48k forks source link

esp32: time.sleep_us() doesn't check for pending events #3493

Closed bboser closed 6 years ago

bboser commented 6 years ago

The program below measures ESP-32 interrupt delay. Two pins are connected by a wire, with the destination detecting a rising edge from the source via interrupt.

After applying the rising edge on the source, the program sleeps ~1000us, sufficiently long for the interrupt to occur.

Problem: if the "sleep" is implemented with a for loop, everything works as expected. But with time.sleep_us(1000), the interrupt never occurs!

sleep_us is implemented in mphalport.c and calls IDF function ets_delay_us(us). I've tried replacing this function with my own C-code calling gettimeofday until the delay has expired. I've even added vTaskDelay(0) to the loop - to no avail.

Apparently the micropython interpreter does something different when it executes a (busy wait) for loop that does not disable interrupts. Anyone knows what this might be? If it's callable from C I'd like to try if I can implement a time.sleep_us that does not mess with interrupts.

` from board import A20, A21 from machine import Pin from time import ticks_us, ticks_diff from time import sleep_us

""" IRQ delay test Pins p_src and p_dst are connected Pin p_src generates a rising edge Pin p_dst irq waits for rising edge (from p_src) ticks_us is used to measure the ets_delay_us

The test code below sets up the rising edge, "sleeps" for ~ 1ms, and calculates the irq delay.

PROBLEM: 1) If sleep is implemented as busy wait (simple for loop), everything works as expected, typical irq delays are 100 ... 400us, mean ~ 120us. 2) With time.sleep_us(1000) interrupts are never received and the test fails.

Apparently time.sleep_us(...) disables interrupts?!? """ p_src = Pin(A21, mode=Pin.OUT) p_dst = Pin(A20, mode=Pin.IN)

t_start = 0 t_stop = 0

def handler(pin): global t_stop

record time when interrupt was received

t_stop = ticks_us()

p_dst.irq(handler, trigger=Pin.IRQ_RISING)

d_min = 10000 d_max = 0 d_sum = 0 N = 500

for _ in range(N): p_src(0)

record time of rising edge

t_start = ticks_us()
# rising edge
p_src(1)
if True:
    # busy wait
    for _ in range(1500): pass
else:
    # os wait
    sleep_us(1000)
# calculate irq delay and statistics
dt = ticks_diff(t_stop, t_start)
d_min = min(d_min, dt)
d_max = max(d_max, dt)
d_sum += dt

print("{} tests, min={}us, max={}us, avg={:6.1f}us".format(N, d_min, d_max, d_sum/N)) `

bboser commented 6 years ago

Code as attachment. Sorry the formatting in the previous message is off. Not .txt extension, .py not accepted.

irq.txt

dpgeorge commented 6 years ago

Thanks for the report. The issue is not that time.sleep_us() disables interrupts, but rather that it doesn't allow (soft) interrupts to be processed until it finishes.

The test above will work if the sleep_us(1000) is replaced by time.sleep_ms(11). This is because a millisecond sleep larger than 10ms will check for pending (soft) interrupts during the sleep. But a shorter ms sleep won't, neither will any us sleep. The reason for the 10ms value is because the FreeRTOS tick is 10ms, and that's the granularity for the check for (soft) pending interrupts.

The for loop delay works because it checks for pending interrupts every loop of the for.

A soft interrupt is when a Python callback is scheduled by a hard IRQ (in the case above a GPIO change IRQ), and it's scheduled to be executed "later" but "as soon as possible". uPy has various checks here and there to check for pending soft interrupts and to run them, but they aren't checked everywhere so there is some delay in processing them.

A few ways to fix this issue would be: 1) implement hard pin interrupts on the esp32 to enable faster response times 2) check for pending soft interrupts in sleep_us(); this would need to be done in a smart way so as not to degrade performance of sleep_us() for small arguments

bboser commented 6 years ago

@dpgeorge Many thanks for the quick and enlightening response!

Now I have two questions:

  1. How do I check for pending interrupts from C? (for a version of sleep).
  2. Can we use the heap in "soft interrupts" - basically the only ones we get at this point on the ESP32 with MicroPython?

Long term the solution for real time on the ESP32 are probably hard interrupts and no RTOS. Or switch to hardware that supports this. But now I need a quick fix for my class which starts in January.

bboser commented 6 years ago

Shown below is a simple measurement to assess the accuracy of time.sleep(us). I am running a loop that toggles a pin with 1000us sleep between. The upper trace, triggered by the rising edge, shows the 1ms high period (a bit longer, presumably because of the micropython calling overhead). The lower trace zooms into the falling edge to see the timing variation. With 1s persistence, most transitions occur in a 10us interval, but one is almost 20us delayed. With longer persistence some transitions are off the screen, i.e. occur more than 30us after the earliest.

I hope to write a modified sleep_us that has no (not much) worse behavior but still services soft interrupts (then presumably incurring much longer delays). In code that requires accurate timing I just turn off interrupts. Then I know what I am getting (and paying).

scope_screen

robert-hh commented 6 years ago

If you look at the way the esp32 connects program memory, you will learn that you will not get reliable timing with software. The program memory is located in a serial flash, which is cached in RAM. So depending on the program flow, there will be cache misses, when code has to be read from flash. That takes about 300µs. The fact that you see so small jitter is most likely caused by your code being small and mostly in cache. I had made a similar test where an ISR was triggered by an external signal. https://forum.pycom.io/topic/936/pin-interrupt-latency The device is a pycom LoPy, which is an esp32 with and external LoRa transmitter, which was disabled for that test. Similar for an esp8266, which has the same memory architecture: https://github.com/micropython/micropython/issues/2972

bboser commented 6 years ago

I did see your results, with dismay. They point to the ESP32 just not being suitable for real time applications. Unfortunately for the class I am teaching I am constrained by cost and the need for wifi connectivity. Not many alternatives there, especially with microphython.

Now I am trying do get something where I can at least demonstrate the concept of an interrupt, even if the performance is poor. Same goes for the ADCs - with 25 LSB INL it really should be called an 8-Bit ADC, if that, with astronomical offset (https://github.com/bboser/IoT49/blob/master/doc/analog_io.md).

The present implementation of sleep_us produces almost random results. Adding a simple if-statement to my test code resulted in most interrupts suddenly being serviced, even with sleep_us(1000). Presumably the additional delay from the if-statement was sufficient for most interrupts to be scheduled before the call to sleep_us. Now explain such behavior to a group of undergraduates who want to have fun with electronics, not frustration.

robert-hh commented 6 years ago

The ESP32 is like the ESP8266 before a device with a lot of features, but none of them really good, except for the wireless connectivity. It is still too obvious, that it started as programmable WiFi modem. But I think for a class project this is still a good device, at low cost, very versatile and still for usable for hardware control and sensor tasks. You test of the ADC is a very detailed and interesting. I'll keep that link. I have considered it once for a project, but after a few trials I moved to an ADS1115. It is simply too noisy. Espressif promised to improve it with the new revision, but I did not check it yet. The Wemos LOLIN32 lite board comes with a revision 1 chip, as well as the Wemos LOLIN32 pro.

bboser commented 6 years ago

All this said, does anyone know how to check for soft interrupts from C?

dpgeorge commented 6 years ago

how to check for soft interrupts from C?

Call mp_handle_pending()

robert-hh commented 6 years ago

@bboser You prepared a very fine compilation of material here: https://github.com/bboser/IoT49, helpful for others too here on the board which might have to hold a class. Just one question of the ADC you made: which voltage source did you use to run the ADC tests, which were most likely to some extend automated?

bboser commented 6 years ago

@robert-hh I used the ADS1115 (and your driver, thanks!) and mcp4725 for testing. The errors of those components are included in the measurements, but negligible compared to those of the ESP32.

The test code ran on the ESP32 itself, except a the plotting software. Communication to the plot server (a 1 page program) was by MQTT. A couple of hours of programming with Python, would have been a few weeks with C ...

I'll teach this stuff in my course and over the coming semester will add it to the website.

bboser commented 6 years ago

@dpgeorge Here's the glorious "fixed" time.sleep_us. Code and measurements are attached.

Summary: Problem: soft interrupts not serviced during sleep_us

Fix: call mp_handle_pending() in sleep_us

Implementation: change mphalport.c replace ets_delay_us in mp_hal_delay_us by mp_hal_busy_delay_us and mp_hal_delay_ms

Directories ets_delay_us and mp_hal_busy_delay_us contain test results before and after the change.

Scope screen shots are for code in sleep_us.py Filename is value of dt insleep_us.py.

Result: New code is is accurate for nominal delays >10us (calling overhead) and services soft interrupts, as desired (ignoring jitter). The accuracy is better than the existing code which does not remove its calling overhead (~30us).

Limitation: Code tested with 240MHz clock on Rev 1 silicon. Delays may be inaccurate for other clock speeds. Different compiler optimization settings could also result in (small?) errors.

Note: I expected delay quantization from the execution speed of the while loop in mp_hal_busy_delay_us but did not observe this in testing. Presumably the code in the while loop is fast.

mp_hal_busy_delay_us, replacing call to ets_delay_us:

void mp_hal_busy_delay_us(uint32_t us) {
    // these constants are for 240MHz clock
    const uint32_t ets_overhead  =  10;    // minimum delay calling ets_delay_us
    const uint32_t mhp_overhead  =  80;    // minimum delay of code past ets_delay_us
    const uint32_t mhp_threshold = 110;    // use ets_delay_us for shorter delays
    // return if requested delay is less than calling overhead
    if (us < ets_overhead) return;
    // for short delays remove calling overhead and use ets_delay_us
    if (us < mhp_threshold) {
        ets_delay_us(us-ets_overhead);
        return;
    }
    struct timeval tv;
    gettimeofday(&tv, NULL);
    uint64_t tstop = tv.tv_sec * 1000000 + tv.tv_usec + us - mhp_overhead;
    uint64_t tnow;
    do {
        // service interrupts
        mp_handle_pending();
        // determine current time to assess if we slept sufficiently long
        gettimeofday(&tv, NULL);
        tnow = tv.tv_sec * 1000000 + tv.tv_usec;
    } while (tnow < tstop);
}

esp32-sleep_us.zip

bboser commented 6 years ago

Never mind what's below - the timing variation occurs in calc (before toggling the pin). Conclusion: machine.Timer is very accurate!

Ignore all below.

Before closing this issue, I want to report a related timing error (?) concerning machine.Timer.

The code below runs a timer callback periodically (15ms in the example) and measures the period error in us using time.ticks_us. Errors > 100us are printed; however this never happens.

This made me suspicious. The scope trace below shows the 15ms period, and the zoomed trace the errors. Clearly, occasionally errors exceeding even 1000us occur (this is with infinite persistence, rather rare events)

What's wrong: my code, the oscilloscope, ticks_us?

BTW: I use print and memory allocation on the timer thread - I assume that's ok since the ESP32 uses "soft interrupts" - correct me if I am wrong.

scope_screen

from machine import Timer, Pin
from time import ticks_us, sleep_ms, ticks_diff
import gc
from board import A19

period = 15  # period [ms]

pin = Pin(A19, mode=Pin.OUT)
last_free = gc.mem_free()
iteration = 0
last_us = None

def calc():
    """Called during each iteration of the timer.
    Simulates code run by timer in a real application.
    """
    global last_free, iteration, pin
    mf = gc.mem_free()
    if last_free < mf:
        print("ran gc, iteration=", iteration)
    last_free = mf
    # allocate memory
    x = "abc" + str(iteration)
    pin(not pin())
    sleep_ms(2)

def timer_cb(timer):
    global last_us, period, iteration
    now_us = ticks_us()
    if last_us:
        error_us = ticks_diff(now_us, last_us) - 1000*period
        if abs(error_us) > 100: print("error > 100us:", error_us)
        iteration += 1
    last_us = now_us
    calc()  // moving this ahead of now_us = ... fixes my logical error

try:
    timer = Timer(1)
    timer.init(period=period, mode=Timer.PERIODIC, callback=timer_cb)
    while True: sleep_ms(1)
finally:
    timer.deinit()
robert-hh commented 6 years ago

I tried to replicate the test on a Wemos LOLIN32 Lite, (esp32, rev1), and get frequent error reports. Here a short list:

MicroPython v1.9.3-188-gd1fd889-dirty on 2017-12-18; ESP32 module with ESP32
Type "help()" for more information.
>>> import timer_15ms
error > 100us: -336 0
error > 100us: 390 1
error > 100us: 174 2
error > 100us: -650 4
error > 100us: 298 6
error > 100us: -128 7
error > 100us: 757 8
error > 100us: -879 9
error > 100us: 266 11
error > 100us: -117 12
error > 100us: 152 14
error > 100us: 555 15
error > 100us: 149 16
error > 100us: -102 17
error > 100us: -728 19
error > 100us: 565 21
error > 100us: -479 22
error > 100us: 296 24
error > 100us: -134 25
error > 100us: 287 27
error > 100us: 532 28
error > 100us: -486 29
error > 100us: 295 31
error > 100us: -540 32
error > 100us: 603 33
error > 100us: -436 35

I added the value of iteration in the printout. I'm using the micropython.org version of the firmware, not the loboris version.

uos.uname()
(sysname='esp32', nodename='esp32', release='1.9.3', version='v1.9.3-188-gd1fd889-dirty on 2017-12-18', machine='ESP32 module with ESP32')

B.T.W.: There is a typo in line 35 of the script. The comment char should be '#'. And yes, I removed the "import board" and selected GPIO16.

bboser commented 6 years ago

@robert-hh thank you for looking into this. I did not run wifi for the test, and perhaps also get different results with the Loboris firmware (I'd like to stick with "standard" python, but at present the port lacks just too many features - telnet, deepsleep, rtc sync).

Ran into a new puzzle with the above code - with dt (delay) > 26ms, the callback is never called. If you have a chance, I'd appreciate if you could check (e.g. dt = 100 ms) to see if it's my setup. I looked at the timer code, and this makes no sense: esp-32 timers are 64Bits and with 1ms resolution won't overflow in my lifetime. Strange.

robert-hh commented 6 years ago

@bboser I repeated the test with your settings and both the standard and loboris firmware. With the standard firmware, time of 100ms run. with the usual jitter, which is smaller when WiFi is disabled. Funny enough, it looks as there is a preference for shortening the time. With the loboris firmware more interesting things happen. It may by, at 15 ms timer setting, that there is jitter for a while, until the gc.collect() is called by your code, then for a while there is no jitter at all, and then after another gc.collect(), it starts jittering again. Below is the copy of a few log lines.

error > 100us: -996 816
error > 100us: 1004 822
error > 100us: -1004 823
error > 100us: 996 829
error > 100us: -996 830
error > 100us: 1004 836
error > 100us: -1004 837
error > 100us: 996 843
error > 100us: -996 844
ran gc, iteration= 846
ran gc, iteration= 1940
ran gc, iteration= 3034
ran gc, iteration= 4128
ran gc, iteration= 5221
error > 100us: 1000 5221
error > 100us: -1000 5222
error > 100us: 1004 5228
error > 100us: -1004 5229

At a timer setting of 100ms, nothing happens any more. No output at the Pin, no message by the program. The limit is between 26 and 27 ms. Juggling with numbers, at 160 MHz there are 4320000000 clock cycles within 27 ms, which is about 2^32. So there may be a 32/64 bit problem.

dpgeorge commented 6 years ago

This issue with time.sleep_us() not checking for pending events was fixed in 4ed586528047d3eced28a9f4af11dbbe64fa99bb . Thanks to @bboser for the initial implementation; the actual fix is slightly different but follows the same ideas.