peterhinch / micropython_ir

Nonblocking device drivers to receive from IR remotes and for IR "blaster" apps.
MIT License

Inconsistent playback timings on Pi Pico W #28

Closed jonadshead closed 3 months ago

jonadshead commented 11 months ago

Hi,

Am I doing something wrong in my setup? I get inconsistent results when I play back data on the Pi Pico W.

Here are the results of a test where I played the data (via multiple sources) and recorded it:

original_data = [4502, 4526, 518, 606, 518, 602, 518, 1722, 542, 1702, 518, 602, 518, 1726, 542, 578, 518, 602, 518, 606, 518, 602, 518, 1722, 546, 1698, 518, 602, 518, 1722, 546, 578, 518, 602, 518, 1722, 546, 1698, 518, 1722, 542, 582, 514, 1726, 542, 578, 518, 602, 522, 602, 518, 602, 518, 606, 518, 602, 518, 1722, 546, 578, 518, 1722, 542, 1698, 518, 1726, 542, 1000]

playback_1 = [4340, 4563, 414, 708, 394, 689, 441, 1844, 426, 1838, 394, 711, 419, 1876, 417, 736, 2003, 3155, 1760, 1690, 1415, 3129, 4873, 4946, 2579, 4583, 2908, 5954, 1522, 2503, 8468, 268, 142, 701, 258, 638, 706, 1604, 287, 1872, 694, 390, 441, 1840, 421, 925, 524, 422, 424, 868, 348, 679, 653, 395, 468, 1747, 229, 898, 442, 634, 422, 1821, 472, 1789, 419, 1843, 442]

playback_2 = [4313, 4583, 393, 709, 420, 684, 420, 1821, 468, 1816, 396, 710, 420, 1848, 442, 686, 416, 687, 444, 719, 438, 691, 2392, 3166, 4702, 4215, 5419, 3657, 4792, 2285, 5763, 3605, 2155, 8555, 94, 352, 602, 497, 147, 751, 422, 711, 559, 1732, 678, 447, 1554, 582, 527, 607, 490, 665, 550, 554, 524, 2006, 546, 449, 681, 1530, 524, 1639, 544, 1924, 419]

playback_3 = [4326, 4568, 431, 673, 416, 706, 447, 1798, 440, 1791, 469, 663, 468, 1801, 464, 637, 443, 688, 599, 584, 732, 2189, 4840, 3238, 4259, 5167, 3519, 4974, 5444, 4556, 1650, 2328, 8368, 251, 182, 664, 285, 637, 335, 1931, 704, 2581, 704, 689, 182, 665, 679, 1604, 682, 396, 470, 1006, 206, 2820, 209, 2037, 437, 1792, 440, 1795, 496]

playback_via_arduino_1 = [4362, 4610, 482, 637, 446, 709, 473, 1813, 420, 1837, 421, 689, 441, 1838, 469, 662, 417, 721, 433, 680, 452, 685, 467, 1819, 443, 1827, 453, 664, 444, 1805, 504, 659, 444, 688, 440, 1817, 470, 1788, 445, 1817, 469, 662, 442, 1865, 419, 689, 455, 656, 462, 683, 463, 688, 419, 714, 443, 710, 418, 1845, 440, 686, 417, 1870, 415, 1841, 439, 1820, 451]

playback_via_arduino_2 = [4361, 4604, 445, 738, 410, 717, 437, 1821, 503, 1755, 418, 711, 418, 1850, 499, 620, 447, 686, 441, 711, 420, 735, 399, 1838, 468, 1789, 465, 690, 439, 1792, 471, 737, 435, 666, 421, 1841, 443, 1791, 463, 1823, 440, 688, 438, 1805, 488, 682, 417, 685, 466, 715, 439, 661, 470, 691, 416, 710, 418, 1842, 485, 672, 420, 1797, 483, 1815, 419, 1842, 443]

As you can see if you compare the original data to the playback it is inconsistent on the Pi Pico W, but consistent on the Arduino.

My testing suggests that the recording of the data is fine. The issue is with the playback.
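A quick way to see where a capture goes wrong is to compare it with the reference entry by entry. The following is only a sketch: `first_divergence` is a hypothetical helper, the 200μs tolerance is an arbitrary assumption, and only the first 20 durations of three of the lists above are included to keep it short.

```python
def first_divergence(reference, capture, tolerance_us=200):
    """Return the index of the first duration differing by more than
    tolerance_us, or None if the overlapping part matches."""
    for i, (ref, got) in enumerate(zip(reference, capture)):
        if abs(ref - got) > tolerance_us:
            return i
    return None


# First 20 durations (in microseconds) of three lists from this post.
original = [4502, 4526, 518, 606, 518, 602, 518, 1722, 542, 1702,
            518, 602, 518, 1726, 542, 578, 518, 602, 518, 606]
pico_1 = [4340, 4563, 414, 708, 394, 689, 441, 1844, 426, 1838,
          394, 711, 419, 1876, 417, 736, 2003, 3155, 1760, 1690]
arduino_1 = [4362, 4610, 482, 637, 446, 709, 473, 1813, 420, 1837,
             421, 689, 441, 1838, 469, 662, 417, 721, 433, 680]

print("Pico capture diverges at index:", first_divergence(original, pico_1))
print("Arduino capture diverges at index:", first_divergence(original, arduino_1))
```

With this tolerance the Pico capture tracks the original up to index 16 and then departs from it, while the Arduino prefix stays within tolerance throughout.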

The code I use is this:

from ir_tx import Player
from machine import Pin

volumeupbutton = [4502, 4526, 518, 606, 518, 602, 518, 1722, 542, 1702, 518, 602, 518, 1726, 542, 578, 518, 602, 518, 606, 518, 602, 518, 1722, 546, 1698, 518, 602, 518, 1722, 546, 578, 518, 602, 518, 1722, 546, 1698, 518, 1722, 542, 582, 514, 1726, 542, 578, 518, 602, 522, 602, 518, 602, 518, 606, 518, 602, 518, 1722, 546, 578, 518, 1722, 542, 1698, 518, 1726, 542, 1000]

ir = Player(Pin(17, Pin.OUT, value = 0))
ir.play(volumeupbutton)

Also, here are the lengths of the data:

original_data = 68
playback_1 = 67
playback_2 = 65
playback_3 = 61
playback_via_arduino_1 = 67
playback_via_arduino_2 = 67

Any ideas would be appreciated greatly.

Many thanks, Jonathan Adshead


Edit:

I have tried the alternate method of address & data for NEC Samsung/NEC 16 bit.

Testing both test(1) and test(8) with the Arduino and the original remote control shows a readable address and data, but if I use the code below (with or without setting the NEC.samsung bool) the receiver responds with "Error: bad block" and "Invalid start pulse".

from machine import Pin
from ir_tx.nec import NEC

powerbutton = 0x1e
functionbutton = 0x8a
volumeupbutton = 0x17
volumedownbutton = 0x16
volumemutebutton = 0x1f

nec = NEC(Pin(17, Pin.OUT, value = 0))
NEC.samsung=True
addr = 0x2c2c

nec.transmit(addr, powerbutton)

peterhinch commented 11 months ago

Firstly I agree that the start sequence indicates Samsung (Samsung starts with 4.5ms on, followed by 4.5ms off). The burst comprises 16 bits of address and 16 bits of data (32 bits, each comprising an ON pulse and an OFF pulse) and a start sequence of one ON and two OFF. So the correct sequence length is 66. Your original data is 68, the Arduino reports 67, and your playback comes in at 67, 65 and 61.
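The burst structure can be checked against the raw data in the original post. The sketch below is not part of the library: `decode_samsung` is a hypothetical helper, and it assumes LSB-first bit order, a 1000μs threshold between short and long spaces, and a final byte that is the bitwise inverse of the command. The posted list consists of a leader mark and space, 32 mark/space pairs, a final mark and a trailing space.

```python
def decode_samsung(burst, threshold_us=1000):
    """Decode a raw list of durations into (address, command).

    burst[0] and burst[1] are the leader mark and space. Each of the
    32 bits that follow is a (mark, space) pair; a long space is a 1.
    """
    if len(burst) < 66:
        raise ValueError("burst too short for 32 bits")
    bits = 0
    for i in range(32):
        space = burst[3 + 2 * i]  # Space of bit i
        if space > threshold_us:
            bits |= 1 << i  # Assumed LSB-first
    addr = bits & 0xFFFF
    cmd = (bits >> 16) & 0xFF
    cmd_inv = (bits >> 24) & 0xFF
    if cmd ^ cmd_inv != 0xFF:
        raise ValueError("command check byte mismatch")
    return addr, cmd


volumeupbutton = [
    4502, 4526, 518, 606, 518, 602, 518, 1722, 542, 1702, 518, 602,
    518, 1726, 542, 578, 518, 602, 518, 606, 518, 602, 518, 1722,
    546, 1698, 518, 602, 518, 1722, 546, 578, 518, 602, 518, 1722,
    546, 1698, 518, 1722, 542, 582, 514, 1726, 542, 578, 518, 602,
    522, 602, 518, 602, 518, 606, 518, 602, 518, 1722, 546, 578,
    518, 1722, 542, 1698, 518, 1726, 542, 1000,
]

addr, cmd = decode_samsung(volumeupbutton)
print(hex(addr), hex(cmd))
```

Under those assumptions the raw list decodes to address 0x2c2c and command 0x17, matching the addr and volumeupbutton values used with the NEC class.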

Please can you clarify your test setup. You need to transmit with the code above:

from machine import Pin
from ir_tx.nec import NEC

powerbutton = 0x1e
functionbutton = 0x8a
volumeupbutton = 0x17
volumedownbutton = 0x16
volumemutebutton = 0x1f

nec = NEC(Pin(17, Pin.OUT, value = 0))
NEC.samsung=True  # This is essential otherwise your start sequence will be wrong.
addr = 0x2c2c

nec.transmit(addr, powerbutton)

If this is producing inconsistent results there is something wrong with the transmission path. You need to ensure that your receiver chip is designed for the correct modulation frequency (38KHz) and that you have a clear path between transmitter and receiver.

I take it that you are running the receiver with

from ir_rx.nec import SAMSUNG

jonadshead commented 11 months ago

Hi Peter,

Thanks for the response. The receiver and transmitter were tested about 2 inches apart with nothing in the way. The transmitter can produce a 38kHz signal; I have been using it for a while in my v1 build on the Arduino. Yesterday I managed to get it working with no issues using this other person's code: https://github.com/mgbaozi/picoir (sorry, but I could not figure out what causes your code to not send consistently).

The receiver code was:

from ir_rx.acquire import test
test(8)
#test(1)

And the raw capture code was:

from ir_rx.acquire import test
import ujson

lst = test()  # May report unsupported or unknown protocol
with open('burst.py', 'w') as f:
    ujson.dump(lst, f)

Many thanks, Jonathan Adshead

peterhinch commented 11 months ago

I'm puzzled. Reading the code comments of the picoir code it produces the NEC protocol (leader 9ms ON, 4.5ms OFF) yet your samples in the original post are Samsung (4.5ms ON, 4.5ms OFF). An oddity is that picoir claims to use a 50KHz carrier, when all the protocols I'm aware of are in the range 36KHz to 40KHz.

I really am baffled as to what is going on, but I do think you need to establish beyond doubt which protocol is being used by the remote you are aiming to emulate.

Secondly, I can't see a reason for great variability between runs other than a hardware cause. If you can get access to an oscilloscope, looking at the output of the receiver chip might pay dividends.

ThinkTransit commented 3 months ago

I have noticed something similar when using the Pico W. In my case I am recording and replaying raw bursts from proprietary A/C remotes.

Acquiring is always fine; however, when replaying there can be random disruptions to the transmissions.

Some of them are really obvious like this, where there is a large interruption to the transmission and the output is stuck high. Green trace is what is sent out from the Pico. [image]

Sometimes it gets stuck low. [image]

Sometimes it is only a minor deviation but still enough to break the protocol. I'm assuming it is some form of internal interrupt that is blocking micropython.

In my case I'm using the following code to transmit.

@app.route('/api/replay/<path:path>')
async def api(request, path):

    on_burst = data[path]

    ir_player = Player(ir_tx_pin, asize=len(on_burst))
    ir_player.play(on_burst)

@peterhinch Would be interested to know if you have any ideas on what could be investigated regarding this. Could wifi or bluetooth on the Pico cause interruptions like this?

Thanks

peterhinch commented 3 months ago

This is an interesting observation. I haven't tested transmission running concurrently with WiFi.

On the Pico the file rp2_rmt.py uses the PIO to replicate the ESP32 RMT device. This code relies on an interrupt. It is possible that concurrent WiFi operations are causing higher priority ISR's to run, causing a delay in servicing the PIO interrupt. I'll give this some thought re possible solutions - it's a while since I worked on the PIO code and I'll have to refresh my memory.

An interesting test might be to impose a delay between receiving the request and initiating the transmission in the hope that WiFi activity has stopped.
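One way to try this is a small wrapper that waits before starting the burst. This is only a sketch: `play_after_quiet` and the 250ms default are assumptions, and `player` stands for any object with a `play()` method such as the `Player` instance used above.

```python
import asyncio


async def play_after_quiet(player, burst, settle_ms=250):
    """Wait settle_ms, then transmit the burst.

    The pause yields to the scheduler so that pending network work has
    a chance to complete before the timing-critical part starts.
    """
    await asyncio.sleep(settle_ms / 1000)
    player.play(burst)
```

In the request handler above this would replace the direct `ir_player.play(on_burst)` call with `await play_after_quiet(ir_player, on_burst)`. Whether the delay helps depends on whether WiFi activity has actually stopped by then.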

ThinkTransit commented 3 months ago

Thanks Peter, I can confirm it's definitely impacted by WiFi. However, the problem is compounded by the fact that in my case the Pico is very busy and the pulses I'm sending are longer than the standard remote codes.

When I send commands using the Player class from the REPL I get over 99% accuracy, but when running the identical code in the body of my application it is less than 10%.

A suggestion on the forum was to try using the DMA to feed the bursts to the PIO so I'm investigating that also.

peterhinch commented 3 months ago

I think DMA should provide a solution. To put this in context, a typical IR on or off time is 500μs. The FIFO holds four periods and the driver tries to keep the FIFO full. Hence the design assumes an IRQ latency of under 2ms. On a bare metal port with hard IRQ's this seemed a very conservative assumption.

Another approach is based on increasing FIFO capacity. Chaining the FIFOs and storing two half-words per entry might take the latency tolerance to 8ms. It's likely that DMA is best in that an entire burst (regardless of length) can be guaranteed to be sent at speed, with IRQ latency affecting only the gaps between bursts rather than their contents.
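The arithmetic behind these figures can be written out as follows. This is a back-of-envelope sketch using the numbers quoted above; `latency_tolerance_us` is a hypothetical helper, not driver code.

```python
def latency_tolerance_us(fifo_words, durations_per_word=1, typical_us=500):
    """Time a full FIFO can cover before it runs dry, in microseconds."""
    return fifo_words * durations_per_word * typical_us


# Current design: 4-word FIFO, one duration per word.
stock = latency_tolerance_us(4)

# Chained FIFOs (8 words) with two half-word durations per word.
chained = latency_tolerance_us(8, durations_per_word=2)

print(stock, chained)
```

This gives 2000μs for the current design and 8000μs for the chained, packed variant, both well below a 30ms stall.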

You have measured 30ms, a truly shocking degree of latency. I wonder if this should be considered a bug. Do you think an issue is appropriate, and if so, do you want to raise it?

ThinkTransit commented 3 months ago

Not sure if it's a bug but definitely a limitation. I will raise an issue and if I get DMA working I will submit a PR for you.

peterhinch commented 3 months ago

Two further observations:

rp2_rmt.py bug

By default it is using soft IRQ's. Please could you test again with this line amended to read:

rp2.PIO(0).irq(self._cb, trigger=sm_no, hard=True)

It is possible that my inadvertent use of soft IRQ's is the root of the problem (in which case there is no issue to be raised).

DMA

See this doc.

I am no longer confident that DMA will fix the problem. As you know, the driver populates the FIFO with ON and OFF durations in μs. In the normal case of (say) an NEC instance, the irqtrain state machine runs. It simply generates an IRQ at the required time to generate an edge. The ISR alters the duty cycle of a PWM pin, alternating between typically 30% and zero. The PWM frequency is that of the carrier.

We have been assuming that the gaps in transmission are caused by the FIFO being starved as a result of ISR latency. DMA will ensure that the FIFO will be kept full, but extreme ISR latency will still garble the output by causing long bursts of carrier or long gaps.

If hard IRQ's don't fix the latency problem, there is a possible solution involving hardware. The other PIO script does not generate a carrier. It creates a pulse train which can be used to gate a continuous carrier. The driver can be adapted to run a PWM pin at the carrier frequency, and use a second transistor in the driver circuit so that the IR LED only comes on if both pins are high.

ThinkTransit commented 3 months ago

Hi @peterhinch

Great news, adding the hard IRQ's seems to have resolved the problem, thanks for picking that up.

When I added your suggested line it stopped all IR transmissions; however, modifying it to the following seemed to work, not sure why.

rp2.PIO(0).irq(handler=self._cb, hard=True)

In relation to the DMA, thanks for the explanation. I was struggling to work out how the irqtrain works but it makes sense now.

I wonder if the PIO could turn the PWM on/off by writing to a specific memory location that controls the PWM? The driver circuit is another great suggestion.

ThinkTransit commented 3 months ago

Small thing but is it fair to say that pin_pulse can't be used at the moment because it isn't a parameter on init?

class IR:
    _active_high = True  # Hardware turns IRLED on if pin goes high.
    _space = 0  # Duty ratio that causes IRLED to be off
    timeit = False  # Print timing info

    @classmethod
    def active_low(cls):
        if ESP32:
            raise ValueError('Cannot set active low on ESP32')
        cls._active_high = False
        cls._space = 100

    def __init__(self, pin, cfreq, asize, duty, verbose, sm_no=0):
        if ESP32:
            self._rmt = RMT(0, pin=pin, clock_div=80, tx_carrier = (cfreq, duty, 1))
            # 1μs resolution
        elif RP2:  # PIO-based RMT-like device
            self._rmt = RP2_RMT(pin_pulse=None, carrier=(pin, cfreq, duty), sm_no=sm_no)  # 1μs resolution
            asize += 1  # Allow for possible extra space pulse
        else:  # Pyboard
            if not IR._active_high:
                duty = 100 - duty
            tim = Timer(2, freq=cfreq)  # Timer 2/pin produces 3

peterhinch commented 3 months ago

You are correct about pin_pulse. The original design used this with the two transistor driver as I described. Then I thought of switching the duty ratio of a PWM and adopted this throughout. I retained the option of generating an arbitrary pulse train as an undocumented feature which might be useful, possibly in a different application. I will add some code comments to clarify this. I don't want to add this to the IR constructor as none of the subclasses would use it.

Re the irq line I'll accept your PR as it's been tested, but looking at the docs and the source I'm puzzled as to why mine didn't work. I'll investigate.

> I wonder if the PIO could turn the PWM on/off by writing to a specific memory location that controls the PWM? The driver circuit is another great suggestion.

If, as you say, hard IRQ's fix the problem, what would this achieve?

Apologies for this bug. I was firmly of the view that I was using hard IRQ's. It always pays to revisit code with a critical cast of mind :)

ThinkTransit commented 3 months ago

> Re the irq line I'll accept your PR as it's been tested, but looking at the docs and the source I'm puzzled as to why mine didn't work. I'll investigate.

I'm not sure why it doesn't work according to the docs either, maybe user error on my part, will be interesting to see if it works for you.

> If, as you say, hard IRQ's fix the problem, what would this achieve?

> Apologies for this bug. I was firmly of the view that I was using hard IRQ's. It always pays to revisit code with a critical cast of mind :)

True, I will leave good enough alone.

No worries, your library is very robust and I have been using it for many years without issue. This is a pretty obscure bug; I appreciate your help getting it resolved!

peterhinch commented 3 months ago

Your observation re trigger is correct: it kills the code. The arg and the C code make no sense to me: a value of 0xF00 works. Whatever the arg does, it doesn't match what I thought it did.

I've pushed an update keeping your invocation with added comments and code formatted with Black.

Closing this as complete.

ThinkTransit commented 3 months ago

> Your observation re trigger is correct: it kills the code. The arg and the C code make no sense to me: a value of 0xF00 works. Whatever the arg does, it doesn't match what I thought it did.

Thanks Peter, should I raise it as a MicroPython bug?

peterhinch commented 3 months ago

It's up to you. Here are my thoughts on it, but you might like to check the code and docs:

Practical results:

peterhinch commented 3 months ago

OK, I think I've figured out the trigger arg by studying the code and by experiment. If you raise an issue you might want to suggest amending the docs along these lines:

trigger bits 8 to 11 correspond to state machine 0 to 3. If a bit is set, the matching SM can raise an interrupt. Hence 0xF00 allows any SM to trigger an interrupt, 0x100 enables only SM0.

I would submit a docs PR but the whole process has become such a pain that I've rather given up with it.

Re rp2_rmt.py the following works with any SM and ensures you can't get spurious IRQ's from another SM:

rp2.PIO(0).irq(handler=self._cb, trigger=1 << (sm_no + 8), hard=True)
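The mask values implied by this description can be checked with plain integer arithmetic. `sm_irq_trigger` is a hypothetical helper for illustration; it only reproduces the bit layout described above and does not touch the hardware.

```python
def sm_irq_trigger(sm_no):
    """Return the trigger mask that enables interrupts from one state
    machine, assuming bits 8 to 11 map to SM0 to SM3 as described."""
    if not 0 <= sm_no <= 3:
        raise ValueError("state machine number must be 0 to 3")
    return 1 << (sm_no + 8)


all_sms = 0
for sm in range(4):
    mask = sm_irq_trigger(sm)
    all_sms |= mask
    print(sm, hex(mask))
print(hex(all_sms))
```

SM0 maps to 0x100, and OR-ing the four masks gives 0xF00, the value found to work by experiment.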