Closed matthijskooijman closed 1 year ago
In general, I wonder if enableInterrupt is really needed at all, though.
You're correct, it is not needed. I think the original idea was in Receive_Interrupt, to make sure reception of a second packet doesn't start before the previous one was processed. But that doesn't really make sense, since to receive a packet, the device would first have to be set back to Rx mode, e.g. by calling startReceive(), which obviously cannot happen until the first packet is processed. I guess I then cargo-culted this flag variable into every other interrupt example.
I tested all the examples for SX127x without the enable flag, and they are working correctly. I was also able to force the race condition by including a delay, and fix it by removing the enable flag.
Thanks for reporting this, I will remove the enable flag from all the examples.
All removed now, thanks for reporting.
Describe the bug
The interrupt examples use an enableInterrupt variable to suppress interrupt handling while the result of a previous interrupt is handled and a new request (TX/RX/CAD/etc.) is being configured. Typically, the flow is something like:

Here, a race condition occurs when the IRQ is triggered after the call to e.g. startChannelScan() and before enableInterrupt is set again. Usually this does not happen, but since timings like these can vary in different circumstances, there is a window for problems. If the problem occurs, the IRQ is not handled and the code locks up, waiting indefinitely for an IRQ that has already happened.

On the STM32WL with CAD I've run into this problem in practice. I've seen that the IRQ happens after around 20ms or so (so that's a lot quicker than a TX and, in most cases, an RX interrupt) and (for some reason I have not investigated) the "print state" step is usually fast but consistently takes around 50ms on the 18th (or so) try.
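The flow described above can be modeled on a host machine to make the lost IRQ visible. This is only a minimal sketch under assumptions: it does not call RadioLib or touch hardware, the "IRQ" is a plain function call, and operationDone is a hypothetical name for the flag the ISR sets. Only enableInterrupt and the position of startChannelScan() come from the examples.

```cpp
#include <cassert>

// Host-side model of the flag logic; no radio hardware or RadioLib calls involved.
struct FlagModel {
    bool enableInterrupt = true;  // name taken from the examples
    bool operationDone = false;   // hypothetical name for the flag the ISR sets

    // Models the ISR: the IRQ is ignored while handling is suppressed.
    void isr() {
        if (!enableInterrupt) {
            return;
        }
        operationDone = true;
    }
};

// Models one loop pass after a completed scan.
// irqDuringPrint == true means the next IRQ arrives after the (modeled)
// startChannelScan() call but before enableInterrupt is set again.
// Returns true if the IRQ was recorded, false if it was lost.
bool irqRecorded(bool irqDuringPrint) {
    FlagModel m;
    m.operationDone = true;       // previous scan has finished

    m.enableInterrupt = false;    // suppress handling
    m.operationDone = false;      // consume the previous result
    // ... the real sketch would call startChannelScan() here ...
    if (irqDuringPrint) {
        m.isr();                  // IRQ fires during the slow "print state" step
    }
    m.enableInterrupt = true;     // handling is re-enabled only now
    if (!irqDuringPrint) {
        m.isr();                  // usual case: IRQ fires afterwards
    }
    return m.operationDone;
}

// Small self-check of the two cases.
void checkRaceModel() {
    assert(irqRecorded(false));   // normal timing: the IRQ is seen
    assert(!irqRecorded(true));   // early IRQ: lost, the loop would wait forever
}
```

Calling checkRaceModel() from a test harness exercises both timings. In the early-IRQ case the flag is never set, which corresponds to the sketch waiting for an IRQ that has already happened.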
To confirm this is what happened, I applied the following patch:
Which produced the following trace on my logic analyzer:
In other words: slowness in serial printing (which might happen for various reasons, especially if USB Serial is involved) causes the code to lock up.
To Reproduce
This can be easily reproduced by taking e.g. the SX126x_Channel_Activity_Detection_Interrupt example and adding a delay to fake slow serial printing:
Alternatively, slow down serial printing by repeating "success" a few times (this depends on the board and serial driver; I've tested this on AVR HardwareSerial (Arduino Uno), and it might not work on native USB serial ports or serial ports with larger buffers).
With either of these patches, the sketch performs CAD once and then locks up:
To fix
To fix this, the enableInterrupt = true line should be moved up, at least to before the serial printing. That still leaves a small race condition, though (e.g. when some other IRQ that takes 20ms happens at just the right moment). To really fix this, enableInterrupt must be set before the call to startChannelScan(), but I'm not sure if this would introduce other problems.

In general, I wonder if enableInterrupt is really needed at all, though. If the hardware guarantees that only one IRQ is triggered for every operation requested, then I think there would be no need to suppress interrupts? But I guess it was introduced for a reason....
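To compare the placements discussed here, the following host-side sketch models where enableInterrupt is set back to true relative to the startChannelScan() call and the "print state" step. It is an illustration under assumptions, not code from the examples: the Step and Placement enums and the irqIsHandled() helper are hypothetical, and no hardware or RadioLib call is involved.

```cpp
#include <cassert>

// Steps of one loop pass in the model, in execution order.
enum Step { StartScan = 0, PrintState = 1, Idle = 2 };

// Where the model sets enableInterrupt back to true.
enum class Placement { AfterPrint, BeforePrint, BeforeStartScan };

// Returns true if an IRQ arriving during the given step is recorded.
bool irqIsHandled(Placement placement, Step irqArrivesDuring) {
    bool enableInterrupt = false;  // suppressed while the previous result is handled
    bool operationDone = false;    // hypothetical flag set by the modeled ISR

    for (int s = StartScan; s <= Idle; ++s) {
        // Re-enable handling at the point chosen by `placement`.
        if (placement == Placement::BeforeStartScan && s == StartScan) enableInterrupt = true;
        if (placement == Placement::BeforePrint && s == PrintState) enableInterrupt = true;
        if (placement == Placement::AfterPrint && s == Idle) enableInterrupt = true;

        // Modeled ISR: the IRQ only counts while handling is enabled.
        if (s == irqArrivesDuring && enableInterrupt) {
            operationDone = true;
        }
    }
    return operationDone;
}

// Small self-check of the three placements.
void checkPlacements() {
    // Original placement: an IRQ during printing is lost.
    assert(!irqIsHandled(Placement::AfterPrint, PrintState));
    // Moved before printing: better, but an IRQ right after the scan starts is still lost.
    assert(irqIsHandled(Placement::BeforePrint, PrintState));
    assert(!irqIsHandled(Placement::BeforePrint, StartScan));
    // Set before startChannelScan(): no window remains in this model.
    assert(irqIsHandled(Placement::BeforeStartScan, StartScan));
}
```

In this model, setting the flag before the scan starts behaves the same as having no flag at all, which is in line with the question of whether enableInterrupt is needed in the first place.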