mysensors / MySensors

MySensors library and examples
https://www.mysensors.org
1.32k stars 895 forks source link

Software ack reponse too quick on nRF24 #1134

Open Yveaux opened 6 years ago

Yveaux commented 6 years ago

MySensors Version: 2.2.0 Hardware:

What I'm doing: Bench marking different gateways (currently a STM32 Serial & Ehternet MQTT)

Problem Encountered: Sending loads of messages from the node to the gateway. Each message send from the node requests an ack'd from the gateway.

Expected Cause:

According to the nRF24 datasheet (section 7.9) there must at least be 130us between the completion of the transmitted message and the hardware ack. response. On top of the 130us we also need some time to transmit the actual ack message. This total time must at < ARD (Auto Retransmit Delay), which is configures to be 1500us for MySensors.

So in MyTransport.cpp in transportProcessMessage just before the (void)transportSendRoute(_msgTmp) I added a delay of 1500uS and it all worked reliably again (successful hardware and software ack's and significant reduction of auto re-transmits of 16 times). This delay can actually be smaller than the 1500us as there is of course some time consumed by all the other tasks (still appeared to work properly with only 150us).

Swiftnesses commented 6 years ago

This explains some of my issues, hope it can be implemented soon!

Yveaux commented 6 years ago

Confirmed issue, using blue pill and atmega328 node.

If gateway does not have MY_DEBUG defined the ARC count on the node goes up quickly.

Very simple workaround is to add delayMicroseconds(1500) at the start of RF24_sendMessage() in RF24.cpp. This assures any hardware acks have been sent before a new packet is sent.

If the delay is moved just after RF24_stopListening() the ARC count goes up again.

cimba007 commented 6 years ago

Just for reference .. I had a similar timing issue with a fast gateway and slow nodes

https://github.com/mysensors/MySensors/issues/578

mfalkvidd commented 5 years ago

Nice work Yveaux.

This seems to be the relevant part of the datasheet: image

Doesn't this mean that we can check the TX_DS bit in the STATUS register for successful transmission and MAX_RT bit in the STATUS register for failed transmission? If IRQ is used, it will be triggered when either TX_DS or MAX_RT are set, so there is no need to delay for an arbitrary amount of time.

When checking the code, I see that we are in fact checking TX_DS and MAX_RT, but we are not using IRQ and the timeout is hardcoded to 65535 executions of the loop with the comment // timeout value after successful TX on 16Mhz AVR ~ 65500, i.e. msg is transmitted after ~36 loop cycles.

This timeout will certainly be too short on fast mcus.

mfalkvidd commented 5 years ago

Sorry for the confusion. The spi speed is 2MHz regardless of mcu so this timeout doesn't change unless the sketch developer overrides spi speed. You can disregard my previous comment.