peterhinch / micropython-async

Application of uasyncio to hardware interfaces. Tutorial and code.
MIT License
742 stars 168 forks source link

Clarification in ENCODER for _tsf_ready() #128

Closed michaelrommel closed 1 month ago

michaelrommel commented 1 month ago

Dear Peter,

thank you so much for the detailed write ups and examples. I am using your Encoder class in one test project and would like to understand one particular section of the code. I did not find a discussion or wiki here, so I raised this issue, I hope that this is ok with you.

https://github.com/peterhinch/micropython-async/blob/5575954a7956f30d370edef2cc8b900a21f6011d/v3/primitives/encoder.py#L93

Here to my understanding this ready function tests, if at the time this code is reached, the ThreadSafeFlag is already set. Three questions arose for me:

  1. In which scenario would it be a problem, if the flag is not cleared and the initialisation runs directly into the delay wait. I could only think of a scenario where the encoder is already rotating, when the code starts and why would we then NOT want to wait?
  2. How is it, that the ThreadSafeFlag can be polled to understand if it is set? I thought polling looks, if data can be read from something like a network socket or serial bus, but how can a flag not be read if it is not set? This is very dark magic for me and a little explanation would absolutely be appreciated...
  3. Why did you use the .wait() function to clear the flag instead of the .clear() function?

Thank you in advance for your time!

Michael.

peterhinch commented 1 month ago

Thank you for raising this. ThreadSafeFlag has acquired a clear method since I wrote this code.

.clear wasn't documented in the tutorial. I have now remedied this.

At the time the only way to clear down the TSF was to wait for it to be set, then to wait on it. So the purpose of this code

            if delay > 0 and self._tsf_ready():  # Ensure ThreadSafeFlag is clear
                await self._tsf.wait()
            await self._tsf.wait()

is to ensure that it waits for an edge. Without the first two lines, if the tsf was already set the delay would start without waiting for an edge. I think this could be reduced to

            tsf.clear()
            await self._tsf.wait()

The delay (if specified) starts on an edge. The purpose of the delay is to rate-limit the callback and the rate of triggering the asynchronous iterator.

You might like to experiment with the sample code to get a feel for the use of the poll mechanism.

Please don't hesitate to ask if you require further clarification.

peterhinch commented 1 month ago

I should add that the reason you can poll a TSF is because it has an ioctl method. See code. [EDIT] I have now amended the code to use the .clear method.

michaelrommel commented 1 month ago

Hi Peter,

many thanks for the explanations and informative links. I am now working through them and will edit this comment, once I have sorted all out...

CU later,

Michael.

peterhinch commented 1 month ago

I have pushed an update to the tutorial which aims to clarify the ThreadSafeFlag material. Returning to that section after a period made me appreciate that the explanation was too terse.

Feedback on the doc is welcome.

michaelrommel commented 1 month ago

I have drawn a small image, describing the delay and the latency issue, you solved in your code. Let me see, if I get it right:

encoder_pulsewave

  1. Latency

The issue is arising, if we have e.g. contact bounces on a manually controlled encoder (I am ignoring high speed encoders like on CNC machines here) Assuming that at least not a single contact bounces, but that both contacts on x and y bounce so quickly that the ISR starts, when the x signal is again on the same level as before. This is the purple shaded section. So with your code that compares the previous value with the current value, this super short bounce is effectively ignored. Which works great by the way... This is just a confirmation that I got the understanding right. Did you ever encounter a situation where only e.g. the x signal flipped back and forth and not the y signal (the teal section)? I understand it depends on the mechanical properties of the encoder and on a typical Keyes KY-040 there seem to be mechanical contacts gliding over a notched plate. So I guess here the bounce will probably be one pin only, as the second one has not reached the edge to the notch.

Another assumption of mine is, that when the x and y pins change in very quick succession (but no bouncing), then even though the y pin changes when still the x ISR runs, still the y ISR will be scheduled, it just runs shortly after the x ISR finishes.

  1. Now to the delay and the clearing of the TSF:

If the microcontroller initialises the code init will define the ISRs on lines 54/55, then the async task is created and enters the _run loop. So during that time, it may happen, that the encoder has already been turned and the ._v value has already reached 4. So if the next turn happens a longer time after the first, then we would ignore those first 2 for this period. I was thinking of moving the .clear() rather to the end of the _run() function at line 102, after we have obtained all current values. Maybe this is nitpicky, because in a continuously running loop it would also be the next instruction to be executed. I was just wondering about the reason of having it at the beginning...

Anyway I think I have now a solid understanding of your code and it works beautifully in my lighting setup. https://github.com/michaelrommel/tvframe.git

I am still making tweaks and will be converting the button and touch classes to the same style of minimal ISRs, ThreadSafeFlags and async iterators. But I needed to understand all of it first - just blindly copying code from the Internet without grasping what's going on is not my style...

Thanks a lot for your support and the updated documentation. It helped really a lot! I am now closing this issue!

Have a nice weekend!

Michael.

Nota bene: I tried this now on another encoder/pico and just what I assumed, if I get contact bounce, it is on one signal only, like so:

pico > mpremote connect /dev/tty.usbmodem7434201 fs cp uuu.py : + soft-reset exec "import uuu" repl
cp uuu.py :
main
x-callback 1 0 1
y-callback 1 1 2
x-callback 0 1 3
x-callback 0 1 3
x-callback 0 1 3
y-callback 0 0 4

Nota optime: I noticed that there may be an async missing in line https://github.com/peterhinch/micropython-async/blob/b4becb1362e90629b1c33719a4f961f06854992e/v3/primitives/encoder.py#L107 It is present in some other of your code examples. I tried it with and without, both seem to work, only the python documentation says __anext__ shall be declared as async

peterhinch commented 1 month ago

You are right about __anext__ - I've fixed this. I haven't found any other instances of the error, but please report if you're aware of any.

Re encoders I don't know if you've seen this doc. I wrote it in response to various people who insisted that you had to write a state machine to decode an incremental encoder and others who argued for explicit debouncing of the signals. There is more nonsense spouted on this topic than you can shake a stick at.

High speed encoders on an NC machine also suffer from contact bounce, caused by vibration. However I have never encountered simultaneous bounce on both channels. This should be physically impossible.

Re clearing the TSF, the general principle is that the ISR's maintain a correct count. The _run routine ensures that the callback runs in an asyncio context and can therefore perform any Python operation. You may appreciate that there are some nasty potential side effects of running a CB in a hard interrupt context. However _run does not influence the underlying accuracy of ._v. Clearing the TSF is non critical, and I believe the code would work without it. It does ensure that a stopped encoder pauses at ._tsf.wait() consuming no resources.

Re absolute accuracy I believe that my approach produces results which are as good as can be achieved with a Python solution. It's not perfect. An encoder can produce arbitrarily short pulses which may or may not trigger interrupts. Even hardware solutions require several stages of synchronisation to avoid metastability issues. If you examine the specification for the STM microcontroller on the Pyboard (which has an encoder interface) it has such synchronisation.

I can claim some experience here. In the mid 70's I designed a hardware interface for encoders on a machine tool. It was essential that this never missed an edge. Achieving this was difficult, however I eventually got it working. I estimate that it probably processed about 500 billion edges in the lifetime of the project. It used the XOR algorithm.

michaelrommel commented 1 month ago

Thank you Peter! I agree with all of your comment. Yes, I read through that linked doc as well.

Somehow I think that the Pico W still has some problems with micropython. In my code I have now three asyncio tasks running in app.py, which deal with the three async for loops getting data from the encoder, the encoder's switch and the touch button. Each of the instance runs such a _run() loop, doing all the python stuff, after it got triggered by the ISR. On the surface, the code runs fine and pushing the encoder's button produces the same calls to the NeoPixel strip as the touch button. However, sometimes, even if it is absolutely the same code path (all triggers the same tvled.toggle() function), the strip does not toggle when I touch the sensor, whereas the strip toggles when I use the encoder button.

I know that the rp2 implementation for NeoPixel uses bitstream and not PIO to push the data to the strip, but that does not explain, why the same call triggered from a different asyncio task works not...

I read that Adafruit removed lots of micropython's functions in their CircuitPython version:

Concurrency within Python is not well supported. Interrupts and threading are disabled. async/await keywords are available on some boards for cooperative multitasking. Some concurrency is achieved with native modules for tasks that require it such as audio file playback.

Maybe that is the reason it behaves so flaky for me.

peterhinch commented 1 month ago

It's hard to comment without further information however I use Pico and Pico W for most of my efforts these days because they are cheap, well specified and support hard IRQ's. My test rigs for micro-gui and micropython-touch use them, and two of the configurations of micro-gui use an encoder. Internally the GUIs make heavy use of asycio. I have not encountered any issues.

Concurrency within Python is not well supported.

To a point.

CircuitPython has a different target audience to MicroPython. If they disable hard IRQ's and threading it's because they require expertise to use properly - a level of know-how that their user base may not possess. Multi-core coding in MicroPython is really quite challenging.

I choose MicroPython because these techniques are available, the limitations are documented, and we're grown-ups :)

michaelrommel commented 1 month ago

Thanks - that is reassuring to know, that it is very stable for you. I decided to flash the Pico again with the micropython 1.23.0 in case something got corrupted during my tests with neopixels and the PIO and I had to take out the Pico from the enclosure to get to the BOOTSEL button. When I put it back I saw the same strange artifacts with the Neopixel strip - I guess the perfboard hardware may have a bad solder joint somewhere... But I want to fire up KiCAD and create a proper PCB like I did with my nightliner keyboard

Anyhow, if you like to see what I am talking about, here is the repo - I mentioned your great support also there in the README. https://github.com/michaelrommel/tvframe

Thanks again, Peter, for your support! Stay healthy and enjoy life!