jgillula / rpi-rfm69

Python RFM69 library for Raspberry Pi
GNU General Public License v3.0
30 stars 20 forks source link

Automatic ACK blocks interrupt handling #22

Open nawbar23 opened 2 years ago

nawbar23 commented 2 years ago

When automatic ACK mode is turned on, _sendFrame is called from _interruptHandler. _sendFrame than contains: with self._sendLock: ....self._setMode(RF69_MODE_TX) ....self._sendLock.wait(1.0) self._setMode(RF69_MODE_RX) which causes interrupt to hang for 1s and it is not possible to release lock as SEND_DONE interrupt is received only when wait expires.

Note time passed between 'Set mode to: 4' - RF69_MODE_TX and 'Set mode to: 3' - RF69_MODE_RX; Note also that the second interrupt is handled just after previous is finished - after waiting 1s for sending and getting into reception mode (twice :-));

2022-05-08 00:00:36,267 - radio - DEBUG - 1976558656 - Interrupt handler: entering 2022-05-08 00:00:36,267 - radio - DEBUG - 1976558656 - Interrupt handler: Send lock notify_all 2022-05-08 00:00:36,268 - radio - DEBUG - 1976558656 - Set mode to: 1 2022-05-08 00:00:36,270 - radio - DEBUG - 1976558656 - replying to ack request 2022-05-08 00:00:36,271 - radio - DEBUG - 1976558656 - Incoming data packet 2022-05-08 00:00:36,271 - radio - DEBUG - 1976558656 - Interrupt handler: return: Sending an ack 2022-05-08 00:00:36,271 - radio - DEBUG - 1976558656 - Send ACK to: 11 2022-05-08 00:00:36,272 - radio - DEBUG - 1976558656 - Begin receive 2022-05-08 00:00:36,272 - radio - DEBUG - 1976558656 - Set mode to: 3 2022-05-08 00:00:36,274 - radio - DEBUG - 1976558656 - Send frame 2022-05-08 00:00:36,274 - radio - DEBUG - 1976558656 - Set mode to: 1 2022-05-08 00:00:36,276 - radio - DEBUG - 1976558656 - Set mode to: 4 2022-05-08 00:00:37,278 - radio - DEBUG - 1976558656 - Set mode to: 3 2022-05-08 00:00:37,280 - radio - DEBUG - 1976558656 - Begin receive 2022-05-08 00:00:37,281 - radio - DEBUG - 1976558656 - Set mode to: 3 2022-05-08 00:00:37,282 - radio - DEBUG - 1976558656 - Interrupt handler: entering 2022-05-08 00:00:37,282 - radio - DEBUG - 1976558656 - Interrupt handler: Send lock notify_all 2022-05-08 00:00:37,283 - radio - DEBUG - 1976558656 - Interrupt handler: return: RF69_MODE_RX && RF_IRQFLAGS2_PAYLOADREADY

YIS001 commented 2 years ago

This is an issue I am also having if many devices (i.e 5 devices) are transmitting every 10 seconds.

MalSmalley commented 1 year ago

I've been having a similar problem. I'm working on getting RFM69 working on a RPi Pico and have the pico communicating with a 'grown up RPi. Getting a hang up when an incoming message requires an ACK to be sent while an outgoing message is being sent.

2023-07-05 20:02:08,168 - RFM69.radio - DEBUG - 1984828480 - Interrupt (rx)Incoming Message 2023-07-05 20:02:08,169 - RFM69.radio - DEBUG - 1984828480 - Interrupt -have intLock (rx)Acquires intLock Sending (tx)Send Message start 2023-07-05 20:02:08,170 - RFM69.radio - DEBUG - 1995557248 - _Send entry (tx)in _send routine 2023-07-05 20:02:08,171 - RFM69.radio - DEBUG - 1984828480 - replying to ack request (rx)Msg read, req Ack 2023-07-05 20:02:08,171 - RFM69.radio - DEBUG - 1995557248 - _Send have sendLock (tx)Acquires sendLock 2023-07-05 20:02:08,171 - RFM69.radio - DEBUG - 1984828480 - Incoming data packet (rx)From interrupt routine 2023-07-05 20:02:08,172 - RFM69.radio - DEBUG - 1995557248 - _Send checking canSend (tx)calling _canSend 2023-07-05 20:02:08,173 - RFM69.radio - DEBUG - 1995557248 - _canSend entry caller:send mode:1 (tx)in _canSend 2023-07-05 20:02:08,173 - RFM69.radio - DEBUG - 1995557248 - begin_receive - entry (tx)call to begin_receive but stalls as can't acquire intLock 2023-07-05 20:02:08,183 - RFM69.radio - DEBUG - 1984828480 - Sending an ack (rx)Sending ack 2023-07-05 20:02:08,183 - RFM69.radio - DEBUG - 1984828480 - Interrupt (ack)- released intLock (rx)release intLock

line 637 of current build

Send acknowledgement if needed

            if ack_requested and self.auto_acknowledge:
                self._debug("Sending an ack")
                **self._intLock.release()**
                self.send_ack(sender_id)
                self.begin_receive()
                return

2023-07-05 20:02:08,184 - RFM69.radio - DEBUG - 1995557248 - begin_receive - have intLock (tx)gets released intLock 2023-07-05 20:02:08,184 - RFM69.radio - DEBUG - 1984828480 - Send_ack entry (rx)in Send_ack
2023-07-05 20:02:08,185 - RFM69.radio - DEBUG - 1984828480 - Send_ack have sendLock (rx)acquires sendLock 2023-07-05 20:02:08,186 - RFM69.radio - DEBUG - 1984828480 - _canSend entry caller:sendAck mode:1 (rx)in _canSend 2023-07-05 20:02:08,186 - RFM69.radio - DEBUG - 1984828480 - begin_receive - entry (rx)call to begin_receive
but stalls as can't acquire intLock

So now we have a stalemate. The Rx process has sendLock but the Tx process has stolen the intLock when it got released so neither can proceed.

The solution, I think.

begin_receive has 'with self._intLock' but it does not need intLock. Its only reading the registers and intLock does not prevent these getting changed as this is happening in the RFM69 this lock can be removed. the sendLock prevents the interference between the two outgoing messages.

So 1) Remove this with clause and (2) move the intLock.release in the ACK clause to after the self.send_ack so the interrupt retains the intLock until Ack is sent resolves this mutex

I actually can't see why _canSend needs to call begin_receive anyway. Removing this from _canSend may be cleaner.

Seems to be robust after this but I'm still testing.

Not needed for this fix but the SendLock.NotifyAll at the start of the interrupt handler is not needed. SendLock never calls a wait or wait_for so it can be downgraded from a Condition to a simple Lock and this can be removed.

jgillula commented 1 year ago

Thank you so much for the detailed analysis! Unfortunately I won't have the time to work on this myself in the near future (since this is just a hobby project and some other life priorities have arisen), but if you'd like to submit a pull request I'd be very happy to accept it.