miketeachman / micropython-rotary

MicroPython module to read a rotary encoder.
MIT License

Locks up if a thread is run on core1: Pico Pi W #27

Open ayjaym opened 1 year ago

ayjaym commented 1 year ago

On MicroPython 1.20, if you start a thread on core 1, even if that thread does nothing, then when an IRQ is triggered from the rotary encoder by code running on core 0 the whole machine locks up, e.g.

on core 0:

t = start_new_thread(EventManager.Start, (onKeyDown, onKeyUp))

and then some code after that to interact with the encoder using this library, which works perfectly if the thread isn't spawned first.

The thread on core 1 is intended to be a class which handles a touchscreen - I have commented out that code because the failure occurs without it. The touchscreen code itself does work fine on core 1, but running anything on core 1, even the dummy code above, seems to cause the rotary encoder code to fail. (Note: Python indentation seems to get lost when I save the issue.)

class EventManager:
    @staticmethod
    def Start(onKeyDown, onKeyUp):
        while True:
            a = 1

The problem occurs even if the while loop just calls time.sleep_ms(1000); it seems that if anything is running on core 1 then IRQs fail.

miketeachman commented 1 year ago

I can reproduce the problem with v1.20 firmware, using Pin object interrupts (which are also used in the Rotary library). I used your example to create a minimal test to reproduce the failure. The processor sometimes locks up when pin 16 is pulled low, thereby creating an interrupt. Note that lockups don't happen every time - typically only 1 in 3 times when the pin is pulled low. There are many GitHub issues and discussions about IRQs and how they run in an RP2 multi-threaded program, for example https://github.com/micropython/micropython/issues/10690. I haven't looked deeply enough into these issues to know whether you have discovered a new bug or one of the existing issues already describes the failure you have found.

In any case, this bug needs to be fixed in the RP2 firmware before the Rotary library will work in a multi-threaded application.

import _thread
from machine import Pin
import time

def pin_callback(p):
    print('callback')

class EventManager:
    @staticmethod
    def Start(onKeyDown, onKeyUp):
        while True:
            a = 1
            time.sleep(1)
            print('core 1')

t = _thread.start_new_thread(EventManager.Start, (1,2))

p16 = Pin(16, Pin.IN, Pin.PULL_UP)
p16.irq(handler=pin_callback, trigger=Pin.IRQ_FALLING)
ayjaym commented 1 year ago

Thanks so much - I realised of course this was a general problem and not specific to your library. Frankly, I'm very discouraged by the immaturity of multi-core support in MicroPython.

I have an open-source (and open-hardware) project which is working brilliantly hardware-wise: a fully user-programmable MIDI controller with an OLED display and a matrix of 16x8 RGB buttons, with everything 3D printed. The basic hardware is working nicely; the Pico's multiple ADC channels made interfacing to the resistive touchscreen, which acts as an underlay to the buttons, very easy. But I really need both cores working, as one will handle MIDI and the other non-time-critical tasks like updating the OLED display and interfacing with the touchscreen and the backing WS2812 LED matrices. Until then I just can't really release it, so effectively I'm now stalled.

Not only do I have this issue, but the WLAN stuff doesn't work on core 1 - it hangs on configuration. Someone suggested adding the country code, which stopped the hangs, but it still doesn't receive UDP packets, so my RTP MIDI implementation (which works perfectly on core 0) is also stalled.

I do appreciate that the whole endeavour is run by volunteers, but it feels like the RPi Foundation really ought to contribute some support here as well; after all, they're putting these things forward as MicroPython-programmable microcontrollers and they have dual cores, so that's kinda important. In any case, the ESP32 and other controllers are now also multi-core. I get that I could probably make all this work in C++, but I want the users to be able to program custom functionality, and MicroPython is really essential for the whole concept. What do we do to get these threading issues prioritised?


peterhinch commented 1 year ago

@miketeachman This is a rather extreme repro because core 0 has no running code. With this remedied, and some action on pin 16 (link pins 16 and 17 so the toggling output generates interrupts), the sample runs:

import _thread
from machine import Pin
import time

def pin_callback(p):
    print('callback')

class EventManager:
    @staticmethod
    def Start(onKeyDown, onKeyUp):
        while True:
            a = 1
            time.sleep(1)
            print('core 1')

t = _thread.start_new_thread(EventManager.Start, (1,2))

p16 = Pin(16, Pin.IN, Pin.PULL_UP)
p16.irq(handler=pin_callback, trigger=Pin.IRQ_FALLING)
p17 = Pin(17, Pin.OUT)
while True:
    time.sleep(1)
    p17(not p17())

Outcome:

<irq>
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
core 1
core 1
callback
...
ayjaym commented 1 year ago

Hmm, well yes, although I'm pretty sure I did see the problem with code actually running on core 0. Let me try a while loop with a sleep as a placeholder and see what happens.

ayjaym commented 1 year ago

OK, well, this may be a stupid error on my part, but the following code, while it does not hang, doesn't work if the encoder library is launched from core 1 - I don't get value changes when it is rotated, though I do of course get switch events because those are polled.

i.e. the test harness is:

from _thread import *
from encoder import Encoder
import time

def onEncoderSwitchDown():
    print ("On Encoder Switch Down")

def onEncoderChange(v):
    print ("On Encoder Change" , v)

t = start_new_thread(Encoder.Start, (onEncoderSwitchDown, onEncoderChange))

Encoder.Start(onEncoderSwitchDown, onEncoderChange)
while True:
    Encoder.CheckEncoder()
    time.sleep_ms(100)

and the class is below. This of course is uploaded to the /lib folder on the Pico and I'm then running the test harness from Thonny directly.

Obviously, when running on core 1, uncomment the while loop in the Encoder class and remove it from the harness, so that the thread performs the poll to look for an encoder value change.

This is just kinda ripped out of the larger project; in reality I would have polled the encoder in the WLAN module while it polls for UDP datagrams, as the overhead is negligible and the value change will occur via an IRQ. But - and I could have made a stupid error here - this code doesn't seem to process the IRQs on core 1 at all. I get that this is different from the original problem, of course.

from rotary_irq_rp2 import RotaryIRQ
from machine import Pin
import time

class Encoder:

    old_val = None
    encsw_val = None
    OnEncoderSwitchDown = None
    OnEncoderChange = None
    encoder = None
    encsw = None

    @staticmethod
    def CheckEncoder():
        v = Encoder.encoder.value()
        print(v)
        sw = Encoder.encsw.value()
        if sw != Encoder.encsw_val:
            Encoder.encsw_val = sw
            if sw == 0:
                Encoder.OnEncoderSwitchDown()

        if v != Encoder.old_val:
            Encoder.old_val = v
            Encoder.OnEncoderChange(v)

    @staticmethod
    def Start(onEncoderSwitchDown, onEncoderChange):
        Encoder.OnEncoderSwitchDown = onEncoderSwitchDown
        Encoder.OnEncoderChange = onEncoderChange
        Encoder.encoder = RotaryIRQ(pin_num_clk=17,
              pin_num_dt=19,
              min_val=0,
              max_val=6,
              pull_up=True,
              reverse=False,
              range_mode=RotaryIRQ.RANGE_BOUNDED)

        Encoder.old_val = Encoder.encoder.value()
        Encoder.encsw = Pin(16, mode=Pin.IN, pull=Pin.PULL_UP)
        Encoder.encsw_val = Encoder.encsw.value()
        #while True:
        #    Encoder.CheckEncoder()
        #    time.sleep_ms(100)
peterhinch commented 1 year ago

IRQs are processed on core 0.

In my experience core 1 should be reserved for computationally intensive tasks with great care taken when communicating between cores. I wouldn't expect classes such as sockets to work when shared between cores because the internal state of the class is not designed for a GIL-free environment. Core 1 is ideally suited for running blocking functions in a way which enables the core 0 code to continue running.

I believe there was discussion among the maintainers as to whether core 1 should run GIL-free. My view is to prefer the high performance solution that we have, but it does mean that an appreciation of multiprocessor coding has to be employed. I favour using asyncio for concurrency on core 0 with core 1 being reserved for code that has an absolute need for true concurrency.

The following allows an encoder to run on core 0 with its state being tracked on core 1:

import _thread
from machine import Pin
import time
import uasyncio as asyncio
from primitives import Encoder

position = 0  # Note constraints on shared globals see THREADING.md
change = 0

def core_1():
    while True:
        time.sleep(1)
        print(f"Position = {position} change={change}")

def cb(pos, delta):
    global position, change
    position = pos
    change = delta

async def main():
    t = _thread.start_new_thread(core_1, ())
    px = Pin(16, Pin.IN, Pin.PULL_UP)
    py = Pin(17, Pin.IN, Pin.PULL_UP)
    enc = Encoder(px, py, div=4, callback=cb)  # div matches mechanical detents
    while True:
        await asyncio.sleep(1)

try:
    asyncio.run(main())
finally:
    asyncio.new_event_loop()

Docs on encoder class and on THREADING.md.

ayjaym commented 1 year ago

Thank you very much for the valuable insights. To be clear, I had no intention of sharing resources like sockets across cores. My intention was for the wireless RTP MIDI code to run on core 1 along with the encoder, because the cost of polling the encoder for value changes is minimal. The core 1 code calls back to deliver available MIDI data or to accept MIDI data to transmit; this seems safe as we are just marshalling a 3-byte command and not accessing internal state across core boundaries. Also, as you see, we have callbacks for encoder value change and switch down.

Core 0 then runs code which the user will put together from building blocks. There is an OLED display, an SD card reader and a touchscreen which provides 137 buttons backed by a set of WS2812 addressable LEDs. Code to communicate with these components will all run on core 0, which allows the time-critical MIDI operations to run on core 1. At some point an in-memory data structure will be supplied to the core 1 code to define a sequence of MIDI events to be transmitted with timestamps. At present, however, I cannot achieve this. I would much prefer this architecture because the device user only adds custom code to core 0, so it's much simpler for them to understand.

I had not anticipated the issues I encountered. I am experienced in multithreaded coding on actual operating systems, appreciate that this is a bare-metal implementation, and am aware of the need for locking, but I found that adding locks did not mitigate my initial issues. If I could resolve them I certainly recognise the need to put locks around things like array allocations, but at present the code doesn't do anything like that.

To summarise: ideally I need the WLAN subsystem to actually work on core 1 and the rotary encoder to work on core 1. This would make things much simpler for my intended audience. I hope this helps clarify where I was going architecturally. I appreciate your input very much.

ayjaym commented 1 year ago

And your comment on IRQs and core 0 led me to Google and find this interesting thread, because I thought "how did he know that, I don't recall reading that in the docs": https://github.com/orgs/micropython/discussions/10638

It looks like the whole multi-threading area is evolving and that right now what I want to do isn't really feasible. The problem is that I don't want the user having to write a loop to capture encoder events on core 0. And I assume the WLAN failures on core 1 may also be because IRQs aren't being handled in that scenario. Now that the Pi Zero is becoming available again, perhaps I need to redesign around that (probably the Zero 2 W ideally, though I'm not sure about availability). That's a nuisance because I need two ADC channels for the touchscreen, so it will add significant cost. Or I can hope that multicore support on the RP2040 might be enhanced and just park this project for a few months. Or get my hands dirty on the code myself, of course!

ayjaym commented 1 year ago

I went back to the project today and confirmed that, with code actually running on core 0 (doing the RTP MIDI stuff with the WLAN plus handling the rotary encoder), if I then have the touchscreen code running on core 1 (and no shared state except the callbacks when a touch event occurs), an IRQ asserted on core 0 when the encoder is turned DOES lock up the processor. So it wasn't just a scenario where I'm dropping back to the REPL on core 0. I am not entirely sure why this doesn't repro directly, but I'm using the Pico W, not sure if this might be material.

I don't think there's anything specific to the core 1 touchscreen code, which does not use any IRQs; it just uses the ADC channels to read the resistive touchscreen and then sends events for touch and release. I guess I'll need to read the encoder by polling and debouncing manually, which is a shame, as your library is such an elegant technical solution. But at least then I can work around all the other threading issues by just running the MIDI stuff on core 0; the touchscreen, OLED and SD card reader stuff ought to be able to run on core 1 (the touchscreen code certainly does), and I can then implement a standard request-block protocol queue which core 1 will poll to pick up e.g. a request to display a message on the OLED screen, so that the MIDI code on core 0 won't jitter. It's not what I would like, but needs must; I have to work within the current limitations of multicore support and just hope at some point Micropython will be a bit more robust in this area. Really appreciate your comments, I learned a lot about internals from reading them!
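
For illustration, here is a minimal sketch of that request-block idea using a lock-protected list (the names, request format and sleep intervals are placeholders, not code from the project):

import _thread
import time

request_lock = _thread.allocate_lock()
requests = []                      # shared list; always access under the lock

def post_request(req):             # called from the MIDI code on core 0
    request_lock.acquire()
    try:
        requests.append(req)
    finally:
        request_lock.release()

def core1_worker():                # polls and services requests on core 1
    while True:
        req = None
        request_lock.acquire()
        try:
            if requests:
                req = requests.pop(0)
        finally:
            request_lock.release()
        if req is None:
            time.sleep_ms(10)      # nothing queued, back off briefly
        else:
            kind, payload = req    # dispatch to OLED / SD / LED handlers here
            print("servicing", kind, payload)

_thread.start_new_thread(core1_worker, ())
post_request(("oled", "Hello"))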

peterhinch commented 1 year ago

I'm using the Pico W, not sure if this might be material.

It's not. I used a Pico W to check the above script. The Pico W behaves identically to the Pico because WiFi is handled by a separate chip: a very nice design.

If you're going to write your own encoder code, you might like to read this. Encoders are surprisingly subtle, and there is a lot of nonsense written about them with absurdly complex algorithms. Hint: with the right algorithm you don't need to debounce them.
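
As one illustration of the kind of algorithm that needs no explicit debouncing (not necessarily the one described in the linked document), a common transition-table decode simply ignores invalid state changes; pin numbers here are placeholders:

from machine import Pin

# Transition table indexed by (previous_state << 2) | new_state, where a
# state is (A << 1) | B. Invalid transitions (e.g. bounce) contribute 0,
# valid ones +1 or -1; which direction counts up depends on the wiring.
_TABLE = (0, 1, -1, 0,
          -1, 0, 0, 1,
          1, 0, 0, -1,
          0, -1, 1, 0)

pin_a = Pin(17, Pin.IN, Pin.PULL_UP)
pin_b = Pin(19, Pin.IN, Pin.PULL_UP)
state = (pin_a() << 1) | pin_b()
position = 0

def poll_encoder():                    # call from an ISR, timer or tight loop
    global state, position
    new = (pin_a() << 1) | pin_b()
    position += _TABLE[(state << 2) | new]
    state = new
    return position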

Here is another approach to design, which keeps all the hardware interfaces on core 0. Use asyncio and my encoder class on core 0. Run an asyncio task which reads the ADC. Whenever there is a significant change, the task puts a pair of readings onto a threadsafe queue. This queue is read by your code on core 1.
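
A rough sketch of that architecture, assuming the ThreadSafeQueue class from the micropython-async repo (constructor and method names are from memory and should be checked against its THREADING.md; pins, ADC channels and the change threshold are placeholders):

import _thread
import time
import uasyncio as asyncio
from machine import ADC
from threadsafe import ThreadSafeQueue   # from the micropython-async repo

queue = ThreadSafeQueue(20)              # or a pre-allocated list; see its docs

async def read_touch():                  # asyncio task on core 0
    adc_x = ADC(26)
    adc_y = ADC(27)
    last = (0, 0)
    while True:
        reading = (adc_x.read_u16(), adc_y.read_u16())
        if abs(reading[0] - last[0]) > 500 or abs(reading[1] - last[1]) > 500:
            last = reading
            await queue.put(reading)     # pauses this task if the queue is full
        await asyncio.sleep_ms(20)

def core_1():                            # user code running on core 1
    while True:
        if queue.qsize():
            x, y = queue.get_sync()      # non-blocking get of a reading pair
            print("touch", x, y)
        else:
            time.sleep_ms(10)

_thread.start_new_thread(core_1, ())
asyncio.run(read_touch())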

hope at some point Micropython will be a bit more robust in this area

I'm not sure that there is a fault in MP. Writing multiprocessor code is notoriously difficult. Subtle bear-traps abound. It's easy to encounter a situation where there is shared state whose existence is hidden. My strong advice is to use asyncio for concurrency on core 0. Keep all the "difficult" interfaces on core 0. Use core 1 only for processes which would block core 0 and follow the guidelines in THREADING.md for inter-core communication.

ayjaym commented 1 year ago

Thanks for your very valuable suggestions. You are quite right that multiprocessor code is tricky. I have spent years working with spinlocks, mutexes, critical sections, memory barriers, atomic operations etc in C# and C++ primarily and I certainly have managed to create some very subtle bugs in the process.

Just for clarity: for the code below, IRQs asserted on core 0 with code running (i.e. not dropping back to the REPL), and with a trivial piece of thread code running which - I assume - can't really expose any kind of multi-threading "beartrap", DO cause the whole processor to lock up. This is 100% reproducible, every time. One movement of the encoder and it locks up.

Why this isn't entirely consistent with some of the other attempts to reproduce the issue I'm not sure; however, it may relate to the number and timing of IRQs that occur with the rotary encoder I'm using.

In any event, the test harness code works perfectly without the thread created, but locks up as soon as the encoder is turned if the thread is created as part of the invocation process.


from _thread import *
from encoder import Encoder
import time

def onEncoderSwitchDown():
    print ("On Encoder Switch Down")

def onEncoderChange(v):
    print ("On Encoder Change" , v)

def breakit():
    print ("thread start")
    while True:
        time.sleep_ms(100)

t = start_new_thread(breakit,()) # this will cause any IRQ later on to lock up the processor
Encoder.Start(onEncoderSwitchDown, onEncoderChange)
while True:
    Encoder.CheckEncoder()
    time.sleep_ms(100)

and the Encoder module is just a wrapper around the standard encoder library

from rotary_irq_rp2 import RotaryIRQ
from machine import Pin
import time

class Encoder:

    old_val = None
    encsw_val = None
    OnEncoderSwitchDown = None
    OnEncoderChange = None
    encoder = None
    encsw = None

    @staticmethod
    def CheckEncoder():

        v = Encoder.encoder.value()
        sw = Encoder.encsw.value()
        if sw != Encoder.encsw_val:
            Encoder.encsw_val = sw
            if sw == 0:
                Encoder.OnEncoderSwitchDown()

        if v != Encoder.old_val:
            Encoder.old_val = v
            Encoder.OnEncoderChange(v)

    @staticmethod         
    def Start(onEncoderSwitchDown, onEncoderChange):
        Encoder.OnEncoderSwitchDown = onEncoderSwitchDown
        Encoder.OnEncoderChange = onEncoderChange
        Encoder.encoder = RotaryIRQ(pin_num_clk=17, 
              pin_num_dt=19, 
              min_val=0, 
              max_val=6,
              pull_up=True,
              reverse=False,            
              range_mode=RotaryIRQ.RANGE_BOUNDED)

        Encoder.old_val = Encoder.encoder.value()
        Encoder.encsw = Pin(16, mode=Pin.IN, pull=Pin.PULL_UP)
        Encoder.encsw_val = Encoder.encsw.value()
        #while True:
        #    Encoder.CheckEncoder()
        #    time.sleep_ms(100)
peterhinch commented 1 year ago

Please could you edit this post so that all the code is enclosed in triple backticks. Then the indentation is preserved, making it easy to read.

ayjaym commented 1 year ago

Sorry, of course, I didn't know about that; I have added them. So hopefully this code example shows that it seems to be impossible to use core 1 at all if you want to assert IRQs on Core 0 - at least with a real rotary encoder connected and using the RotaryIRQ module. Unless of course there's something wrong with my hardware. I have plenty more of these, so I can re-test on a different Pico W if necessary, or indeed a standard Pico. If I could get this working I could probably just have the WLAN code and rotary check on core 0 and put the other stuff - which doesn't use IRQs and shouldn't expose any threading issues - on core 1. If not, I could try to get everything working with async, and I appreciate your suggestion, but it is a huge shame to have a whole processor core unused - it would really help enormously to be able to move some of the processing to core 1.

peterhinch commented 1 year ago

So hopefully this code example shows that it seems to be impossible to use core 1 at all if you want to assert IRQs on Core 0

It doesn't really, because my sample here shows the opposite: my library uses IRQs and the sample ran with a real encoder. Rather than trawl through the RotaryIRQ code, why not give my library a try?

ayjaym commented 1 year ago

I definitely did not drink enough coffee! Should have looked closer at your code to realise you weren't calling the same module. I will try that and see.

peterhinch commented 1 year ago

I've had a look at Mike Teachman's driver. It uses a different approach from mine. My driver attempts to minimise the runtime of the ISRs: the ISR code is as small as I could make it. RotaryIRQ has a potentially quite long runtime, especially if the Listener mechanism is used. The reason this matters (in my opinion) is as follows.

Using IRQs to handle switch contact closures is undesirable because of contact bounce. This can cause arbitrarily short pulses and - worse - invalid logic levels. The risk of the latter can be mitigated with low-value pullup resistors, which I recommend. Short pulses can cause re-entrancy if the ISR runtime is long. Alas, IRQs are necessary for an encoder interface, hence my approach of minimising ISR runtime.

Another design point is that my callbacks are run from an asyncio context. This entirely decouples the callback from the ISR.

My ISRs run in 36μs on a Pyboard 1.1 and 50μs on ESP32. Note that soft ISRs still run fast, but are subject to long latency. The ISR design aims to allow for potential latency. Hard ISRs are used if available (e.g. on RP2) because latency sets a limit on allowable rotational speed.
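
By way of illustration only (this is not the library's actual ISR), the principle is to keep the hard ISR tiny: latch the new state, then defer any Python-level work with micropython.schedule so the interrupt context exits quickly. The pin number is a placeholder:

import micropython
from machine import Pin

micropython.alloc_emergency_exception_buf(100)  # report errors raised in ISRs

edge_count = 0

def process(_):                  # runs later, outside the interrupt context
    print("edges so far:", edge_count)

def isr(pin):                    # hard ISR: minimal work, no allocation
    global edge_count
    edge_count += 1
    try:
        micropython.schedule(process, None)
    except RuntimeError:
        pass                     # schedule queue full; drop this notification

pin_a = Pin(17, Pin.IN, Pin.PULL_UP)
pin_a.irq(handler=isr, trigger=Pin.IRQ_FALLING | Pin.IRQ_RISING, hard=True)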

To be fair I can't identify a specific reason why RotaryIRQ is failing in a dual-core environment.

Please note that my Switch class does not use IRQs: normal switch and pushbutton operations are much better handled by asyncio polling.
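
A minimal sketch of that polling approach (not the actual Switch class): sample the pin at a fixed interval, which also debounces it, and fire a callback on the falling edge. Pin number and interval are placeholders:

import uasyncio as asyncio
from machine import Pin

async def poll_button(pin, on_press):
    last = pin()
    while True:
        await asyncio.sleep_ms(50)      # polling interval doubles as debounce
        state = pin()
        if state == 0 and last == 1:    # active-low press detected
            on_press()
        last = state

async def main():
    button = Pin(16, Pin.IN, Pin.PULL_UP)
    asyncio.create_task(poll_button(button, lambda: print("pressed")))
    while True:
        await asyncio.sleep(1)

asyncio.run(main())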

ayjaym commented 1 year ago

Yes, I have to agree with your analysis and can confirm that your code does indeed run correctly. I have written time-critical interrupt code previously, and indeed you want to be in and out as quickly as possible. I apologise for not looking more closely at what you were originally proposing - that core 1 could track rotary encoder changes made on core 0 - and stupidly thinking 'yeah, but that's not what I want to do' without actually examining your code in detail. Now I need to read and digest the whole asyncio library and have a second read through THREADING.md. I have a real sense of déjà vu, in that way back in time I used to write TSR (terminate and stay resident) code for MS-DOS, which definitely hits many of these issues.

I need to think about how to implement this model, given that the project is intended to be relatively easy for the user to customise. I provide services for writing to the OLED display, reading/writing the SD card, reading the button matrix (which, as I said, is implemented via buttons on top of a resistive touchscreen) and writing to the underlying WS2812 LED array. Some of these functions will take time, although they could be carefully cut up into pieces - but cooperative multitasking isn't much fun and I don't really miss Windows 3.x :) I would ideally like the user to just write code and poll the encoder in it to get state-change callbacks (as in my original code - it's not time critical, and if they don't poll, the IRQs ensure the state stays accurate anyway), although the alternative is to pass in a 'YourCode' function which is executed regularly, which obviously your current async encoder architecture can do. [I can probably set something up around timers, of course; I need to think on this.] And I have to think carefully about the wireless MIDI stuff, which I had wanted to run on core 1, out of the way; that doesn't look like it will work, for as yet undetermined reasons (possibly IRQs, but who knows?). Still, this is just good architectural design; at least I can see it IS theoretically possible, although the caveats in THREADING.md are going to be significant. Alas, I'm spoiled by C# and its intrinsically threadsafe collections.

I'm not clear whether 'globals', as opposed to, say, static class-scoped variables, hit the same problems with initialisation. Certainly my reasonably complex touchscreen code did seem to run OK on core 1 with the MIDI stuff running on core 0, and I hadn't done anything specific to lock anything because there's no explicit shared state - but whether that was just luck I don't know. I wrap most of the code in static classes because there's no need to instantiate anything other than a singleton - there is only one of each of the I/O peripherals and so on - and classes at least give me separation of concerns. Anyway, many thanks. I had not realised the depth of information in this project, which is probably unavailable elsewhere, so it behooves me to go through, read everything and experiment.

peterhinch commented 1 year ago

cooperative multitasking isn't much fun

I think it is. But that's beside the point: it's damn near essential in firmware.

I suspect that your experience is in PC programming rather than firmware. I first got involved with MP nine years ago when it was in its infancy. Its asyncio was useless, with a time resolution of 1s, so I wrote my own scheduler with ms resolution because, in my experience, cooperative multitasking is essential in almost all firmware projects. Since then MP has acquired an absolutely excellent asyncio, so I abandoned mine long ago.

I urge you to learn asyncio - I wrote the tutorial in an attempt to promote its use.

Tools such as interrupts, timers and threads have vital uses but they bring significant complexity compared to concurrent cooperative tasks. Unfortunately people new to MP look at these tools and think they are the answer to concurrency, using them in quite inappropriate ways. My approach is to use these tools only when the requirement is unachievable with asyncio. I suggest you consider a design based on this principle.

Incidentally, I too wrote TSRs, but not professionally. I first encountered cooperative scheduling when, in about 198, I took on a professional firmware project from a firm that had gone bust. It was written in Intel 8080 Assembler and was a revelation. I realised I'd spent the previous five years writing firmware the wrong way...

ayjaym commented 1 year ago

Well, yes, I do have quite a bit of firmware experience but that was a long time back working with the Microchip PIC devices with a massive 1K of program storage and 128 bytes of RAM. These modern microcontrollers are orders of magnitude more complex and often not terribly well documented. I certainly agree though 100% with what you say. Clearly this is a sophisticated and well engineered toolset and not learning it would be just damn stupid on my part, so I need to spend some time getting my head around how things work.

For example, in C# async/await works under the hood by actually using a second thread, so that the async routine really returns twice: once to the caller immediately, and then a second time, on a different thread, back to the runtime when the operation actually completes. But in MicroPython this must operate a bit differently, since you're clearly not going to spawn an actual thread, there being no OS to do this. So the underlying scheduling subsystem is presenting the abstraction of a small cooperative multitasking OS to the caller, I assume, in order to actually implement this. Which is pretty amazing, because really this is a small near-RTOS you're effectively building, at least at the task-management level - you're obviously deferring other things out to the bare metal, like the filesystem, but fortunately there are of course other modules to take care of that.

One good thing is that it seems possible to set the Pico clock rate to 250MHz without anything obviously failing - I think this is about as far as you can go before flash clock-speed limitations become an issue - and this substantially increases the processing power at a fairly negligible increase in current drain as far as I can see. My whole project comes in at 420mA on the USB bus, which for everything including 137 LEDs is pretty good. As I said, the hardware works perfectly, I'm delighted with it, the whole thing is 3D printed and it's cheap to build; I had just underestimated the software challenge, but that's life. I have learned such a lot from interacting with you here and I do appreciate your time, effort and patience. I'll try to get this damn thing done and released as soon as I can, now.
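
For reference, the clock change mentioned above is a one-liner; 250MHz is well beyond the RP2040's rated speed, so treat it as an at-your-own-risk overclock:

import machine

machine.freq(250000000)   # set the system clock to 250 MHz
print(machine.freq())     # confirm the frequency actually in effect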

peterhinch commented 1 year ago

The fundamental principle behind asyncio is simple (although the implementation is extremely clever). If you understand Python generators you have the gist of how coroutines may be implemented: a coroutine is a special case of a generator. A coroutine can yield at chosen points and resume execution at the instruction after the yield. The scheduler maintains a list of suspended coroutines and transfers execution to them in a way that ensures that each gets a share of the action. While this is highly simplified, there is no "magic" going on. It is pure Python, no threading is involved and there is no true concurrency.*
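
A toy illustration of the principle in ordinary Python (this is not the asyncio source): each task is a generator that yields at its suspension points, and a trivial scheduler resumes the suspended tasks round-robin:

def task(name, count):
    for i in range(count):
        print(name, i)
        yield                      # suspension point, analogous to "await"

def run(tasks):
    tasks = list(tasks)
    while tasks:
        for t in tasks[:]:         # iterate over a copy so we can remove
            try:
                next(t)            # resume the task until its next yield
            except StopIteration:
                tasks.remove(t)    # the task has finished

run([task("a", 3), task("b", 2)])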

The code may be viewed here.

*The actual implementation has the task.py module duplicated in C for performance.