MicroPython: Improve QTR read consistency

adamgreen commented 1 year ago

I noticed an issue with inconsistencies in the QTR readings when porting the PIO and corresponding QTR code from MicroPython to my 3pi+ 2040 Arduino Library project. Increasing the charge time to 200us like in the MicroPython code worked around the problem but originally I just ported the CPU version of the QTR reading code from the Pololu 3pi+ 32U4 Arduino Library and it only needed 10us of charge time to get consistent readings. This made me suspicious that something about the PIO code path that I ported from MicroPython contributed to the inconsistent readings I was seeing with shorter charge times. After attaching an oscilloscope to one of the QTR sensor pins I saw what was going on and was able to track down the issue in the PIO code path:

Originally the QTR driver would have the PIO code start the charging phase at the end of the previous read.
The CPU was responsible for kicking off the next read by the PIO.
The PIO would then start the discharge phase and start timing the discharge for each light sensor pin.
Depending on what the CPU was used to do after the previous read completed and before starting the next read, the capacitors may have ended up with a different level of charge and this would impact the time it took to discharge.
The PIO code should be responsible for the timing of the charging phase as well as the discharging phase or the readings can be inconsistent.

This commit includes a change which makes the PIO code responsible for charging the capacitor for 8us, consistently on each and every sample, before it switches the sensor pins to be inputs and measuring the discharge time. I also removed the CPU delay as it is now taken care of by the PIO code itself.

This commit also includes a change which initializes the X register to 7 bits of 1 as that should be the initial state of the sensor pins when fully charged.

pdg137 commented 1 year ago

Thanks for the PR! To be clear, this inconsistency was only a problem after porting the PIO algorithm to C, right? In C, it's probably better to do the capacitor charging outside of PIO since it's not timing-critical and that saves a bunch of PIO instructions.

In Python, I like the change for a different reason - it's better to keep the lines as inputs when not in use to save power and just generally keep things inactive until you need them. You say the capacitors need ~10us to charge; so could you increase the delay to ~16us instead of ~8us to be sure to charge them completely?

Separately, I don't think we can remove the 200us bump sensor delay. That has to do with the bump sensors responding slowly to IR and needing 200us to reach a stable value; I don't think the 16us before sensor reading is going to be enough.

adamgreen commented 1 year ago

Thanks for the PR! To be clear, this inconsistency was only a problem after porting the PIO algorithm to C, right? In C, it's probably better to do the capacitor charging outside of PIO since it's not timing-critical and that saves a bunch of PIO instructions.

I don't think so. I think the 200us in the MicroPython code was probably compensating for this issue too. Photo transistors shouldn't take 100's of µs to respond.

It is probably always better to do the charging in the PIO code since it can get consistent timing without needing to disable interrupts, etc. that would be required on the CPU side.

In Python, I like the change for a different reason - it's better to keep the lines as inputs when not in use to save power and just generally keep things inactive until you need them. You say the capacitors need ~10us to charge; so could you increase the delay to ~16us instead of ~8us to be sure to charge them completely?

Yes, that is a 1 character change. If we found it worked better at 16µs than 8µs then that is something to consider. I was erring on the side of using less time than more.

Separately, I don't think we can remove the 200us bump sensor delay. That has to do with the bump sensors responding slowly to IR and needing 200us to reach a stable value; I don't think the 16us before sensor reading is going to be enough.

That hasn't been my experience so far. Maybe your lighting conditions are different. I think the 200µs needed for a stable value was a side effect of this bug and not actually the photo transistor taking that long to respond to a light source.

adamgreen commented 1 year ago

In the C++ code, I have come up with a way to reduce the PIO code size by 3 instructions by doing some of the state machine register initialization on the CPU instead. I can make this same change to this MicroPython PR as well if you decide you want to take it.

You can see my C++ change here: https://github.com/adamgreen/pololu-3pi-plus-2040-arduino-library/commit/794b2b8aee7b07e6a5259c3cfa29462916aff3da

pdg137 commented 1 year ago

Regarding the 200us delay, we are doing some quick tests with this simple script:

import time
from pololu_3pi_2040_robot import ir_sensors

bump_sensors = ir_sensors.BumpSensors()

while True:
    r = []
    for i in range(5):
        r.append(list(bump_sensors.read()))

    print(f"Readings: {r}")
    time.sleep_ms(1000)

Here's what the results typically look like with your code (buttons unpressed):

Readings: [[151, 138], [118, 110], [118, 110], [118, 108], [120, 111]]

The first readings are significantly higher than later ones, which I'm assuming is because the phototransistor is at least somewhat sensitive to its illumination over the last few hundred microseconds.

Do you see the same effect and have any other explanation?

pdg137 commented 1 year ago

I like the idea of saving PIO space by setting up the state machine more from the CPU side, but I know an easy way to do it in MicroPython. I think all we have is StateMachine.exec which takes a string argument and compiles it into PIO code - much too slow for this purpose.

Unless I'm missing something, the way to get there would be to make these functions more accessible in MicroPython, maybe by letting exec optionally take a numeric value?

adamgreen commented 1 year ago

Do you see the same effect and have any other explanation?

I will take a look at it now and see if I can reproduce and debug it. Thanks so much for taking a look at my code.

adamgreen commented 1 year ago

Are you only seeing the higher reading on the very first reading? When I run it, I see a higher reading on the very first reading and then every subsequent iteration, it gives similar readings:

Readings: [[114, 157], [83, 106], [81, 106], [82, 106], [82, 106]]
Readings: [[84, 112], [82, 106], [82, 107], [82, 106], [82, 107]]
Readings: [[85, 113], [82, 106], [82, 107], [82, 107], [82, 107]]
Readings: [[85, 113], [82, 107], [82, 106], [82, 107], [82, 107]]

I would suspect that that can be explained by the fact that until your program starts, the default pull-up on the pins is charging up the capacitor slowly.

pdg137 commented 1 year ago

It's every time (but the first is indeed notably worse):

>>> %Run -c $EDITOR_CONTENT
Readings: [[201, 179], [122, 113], [122, 112], [121, 111], [122, 112]]
Readings: [[153, 139], [122, 112], [121, 111], [121, 113], [120, 112]]
Readings: [[147, 133], [123, 112], [119, 109], [122, 112], [120, 110]]
Readings: [[153, 138], [121, 113], [121, 111], [121, 111], [122, 112]]
Readings: [[150, 135], [121, 111], [121, 112], [119, 109], [121, 111]]

If I add back the 200us delay it actually flips a little:

>>> %Run -c $EDITOR_CONTENT
Readings: [[170, 154], [111, 104], [112, 104], [113, 104], [112, 105]]
Readings: [[105, 100], [112, 105], [112, 104], [111, 104], [113, 105]]
Readings: [[106, 100], [112, 104], [111, 104], [113, 104], [112, 105]]
Readings: [[106, 99], [112, 105], [112, 103], [112, 103], [112, 105]]
Readings: [[105, 99], [111, 104], [113, 106], [113, 104], [112, 104]]

To be clear, this is with your code copied over ir_sensors.py in the default firmware.

adamgreen commented 1 year ago

I found that if I put the robot right under my fluorescent light I could get a bit more variability like you are seeing where the first sample in an iteration would give lower readings. I believe that this is happening because the capacitor is continuing to discharge even more during the sleep so that the charge cycle doesn't get quite as much charge on the cap.

I tried increasing the charge time to 16us and that helped a little bit.
I also tried changing the timeout to 2048 instead of 1024 and that gave more consistent results.

--- a/micropython_demo/pololu_3pi_2040_robot/ir_sensors.py
+++ b/micropython_demo/pololu_3pi_2040_robot/ir_sensors.py
@@ -4,7 +4,7 @@ import time
 import rp2
 from rp2 import PIO

-TIMEOUT = const(1024)
+TIMEOUT = const(2048)
 _DONE = const(0)
 _READ_LINE = const(1)
 _READ_BUMP = const(2)
@@ -34,7 +34,7 @@ class QTRSensors:
         jmp(y_dec, "charge")

         # Load 1023 (10 bits of 1s) into Y as a counter
-        out(y, 10)
+        out(y, 11)

         # Initialize X (last pin state) to 7 bits of 1s.
         out(x, 7)

I did notice that the 32U4 Arduino library used a timeout of 4096.

If increasing the timeout helps you as well, you could try putting the timeout back to 1024 and increasing the charge time to 32us by shifting in 2 more bits from the OSR to see if that is enough to give you consistent results as it would keep the overall sample time lower.

adamgreen commented 1 year ago

Unless I'm missing something, the way to get there would be to make these functions more accessible in MicroPython, maybe by letting exec optionally take a numeric value?

Sorry for not replying to this post earlier tonight. I was busy reproducing the problems that you were highlighting that still existed with my PR and forgot about this one.

I looked at the C code of MicroPython and have to agree with you. I think we could use the PIOASMEmit class to generate the machine code for the 3 needed instructions at init time and store them in the object. We could then reach around and poke the PIO registers directly to execute them on each restart. I don't see a direct way to get the PIO and StateMachine indices to know which instruction register we need to set but I don't think it is that hard to figure out which one is the right one though. Just record the PIO::CTRL register for both PIO instances before enabling our state machine and after. The change in 1 of the SM_ENABLE bit masks will tell us which state machine slot our code is running in.

adamgreen commented 1 year ago

Actually after looking at this some more I think increasing the charge time to 32µs gives the most consistent readings.

So rather than the diff I listed above, maybe try this one instead:

--- a/micropython_demo/pololu_3pi_2040_robot/ir_sensors.py
+++ b/micropython_demo/pololu_3pi_2040_robot/ir_sensors.py
@@ -27,9 +27,9 @@ class QTRSensors:
         # Set pindirs to 7 bits of 1s to enable output and start charging the capacitor.
         out(pindirs, 7)

-        # Charge up the capacitors for ~8us.
-        # Set Y counter to 63 by pulling another 6 bits from OSR.
-        out(y, 6)
+        # Charge up the capacitors for ~32us.
+        # Set Y counter to 255 by pulling another 8 bits from OSR.
+        out(y, 8)
         label("charge")
         jmp(y_dec, "charge")

Maybe the bumper emitter LEDs are just not as bright when they have been given time to cool down between readings? I guess it could also be as you originally said, the QTR takes time to adjust to changes in light so the longer the LED has been off between readings, the slower the QTR will discharge the capacitor. I didn't think a QTR would do this but it seems more likely than an LED being dimmer when it is cooler.

adamgreen commented 1 year ago

I have updated this PR to charge the capacitor for 32µs from the PIO code.

Should I investigate doing some of the PIO state machine register initialization from the CPU under MicroPython to reduce PIO code size?

pdg137 commented 1 year ago

Thanks; I merged this. There's still some remaining variation in the readings, but the 32us delay seems to take care of most of it, and I like how this is simpler and the PIO takes care of most of the work.

I don't think it's worth messing around poking values in RAM or whatever you would need to do now to initialize the PIO state from the CPU. Instead I recommend directing some PRs at MicroPython to make the relevant features from pico-sdk accessible more cleanly in Python. Like I was saying earlier, it seems like the simplest way to start would be to allow exec to take a numeric argument.

adamgreen commented 1 year ago

I don't think it's worth messing around poking values in RAM or whatever you would need to do now to initialize the PIO state from the CPU. Instead I recommend directing some PRs at MicroPython to make the relevant features from pico-sdk accessible more cleanly in Python. Like I was saying earlier, it seems like the simplest way to start would be to allow exec to take a numeric argument.

The PR I made to the MicroPython project with this functionality has now been merged: https://github.com/micropython/micropython/pull/11390

Thanks to you guys for reviewing and providing feedback on that PR on Discord and GitHub. Much appreciated!

I also created PR #7 in this repository for your consideration. It takes advantage of this new MicroPython functionality.

pololu / pololu-3pi-2040-robot

MicroPython: Improve QTR read consistency #1