nRF24 / RF24

OSI Layer 2 driver for nRF24L01 on Arduino & Raspberry Pi/Linux Devices
https://nrf24.github.io/RF24
GNU General Public License v2.0
2.2k stars 1.01k forks source link

startListening() calls closeReadingPipe(0) #671

Closed 2bndy5 closed 3 years ago

2bndy5 commented 3 years ago

I don't throw the "bug" label around lightly, but I've been racking my brain and triple checking the datasheet. This actually relates to #496 as it would only happen when the uint8_t array pipe0_reading_address's first byte is 0. In #496, @justinBR pointed out that startListening() executes the following code:

    if (pipe0_reading_address[0] > 0) {
        write_register(RX_ADDR_P0, pipe0_reading_address, addr_width);
    } else {
        closeReadingPipe(0);  // THIS LINE IS A PROBLEM
    }

which stems from maniacBug's initial commit titled "added"

What was the original intention???

I can't understand the intention to close pipe 0 (its power-on-reset value makes it open). For a function whose sole purpose is to enable RX mode, it should not be closing any RX pipes. I realize that pipe 0 is also used for TX-ing ACK packets, but that functionality is automated via the nRF24L01's firmware (see list item 2 on this page). Also, the address for ACK packets is fetched from the RX_ADDR_P# register related to the pipe that received the instigating payload (not the TX_ADDR register -- see also the first paragraph on this page). This is why the nRF24L01 can listen to 6 devices simultaneously instead of 5. I know the docs about openReadingPipe() say:

Pipe 0 is also used by the writing pipe so should typically be avoided as a reading pipe.

Nordic Semiconductors didn't design pipe 0 to be a TX only pipe for good reasons. Additionally, the datasheet never warns against using pipe 0 for RX mode.

  1. Code to reproduce

    uint64_t address = 0xFBFBFBFB00LL;
    // 0xFBFBFBFB00 is a malformed address due to the 0x00 byte
    radio.openReadingPipe(0, address);
    radio.startListening();
  2. Expected behavior After the call to startListening() pipe 0 should be open for receiving, but it gets closed because the pipe's address (& its shadow copy pipe0_reading_address) started with a 0.

  3. What device(es) are you using? Please specify make and model. Doesn't matter. This is a conceptual bug limiting adequate/proper usage of the transceiver.

Suggested Solution

  1. ~Simply remove the else block that closes pipe 0.~ This step negates original intention (see below comment); leave the code as is.
  2. We should also remove the docs' discouragement about using pipe 0 as a reading pipe as I've already stated that pipe 0's usage for TX operations during RX mode is automated in the nRF24L01 firmware (when auto-ack is enabled). Instead the docs about openWritingPipe() should mention how pipe 0 needs to be re-appropriated (using another call to openWritingPipe()) for RX-ing ACK packets if openReadingPipe() and/or startListening() is called after setting the TX address (even if the TX address hasn't changed). The revised docs about openWritingPipe() should also note that the TX address is set to RX pipe 0 despite if the auto-ack feature is disabled (though this behavior is only required when auto-ack is enabled).

Additional context

I want to be clear that this will not solve #496, but it might limit that issue's occurrence. The docs, TMRh20's comment in #496, and the erroneous code at the top clearly state/show why using the integer 0 for the first byte in the address written to pipe 0 is strongly discouraged.

In trying to make sense of this call closeReadingPipe(0), I checked with RadioHead's RH_NRF24.cpp file, and it also doesn't close any RX pipes when enabling RX mode.

Avamander commented 3 years ago

I've already stated that pipe 0's usage for TX operations during RX mode is automated in the nRF24L01 firmware (when auto-ack is enabled)

This does sound like a nice footgun to be honest.

Also is the behaviour consistent with the nRF24L01 non-plus and maybe even clones?

2bndy5 commented 3 years ago

is the behaviour consistent with the nRF24L01 non-plus and maybe even clones?

I'm not sure. Preliminary guess (due to the copy and paste I've seen in those datasheets): Yes. But I have to do more research on that (I don't own any clones to test). It would be nice to have a list of supported clones somewhere in the docs (just to have a clear starting point for that question).

footgun

you mean on maniacBug's behalf? I spent the last 3 hours trying to make sense of this, so my brain is constantly questioning what I know or may not know.

I'm also wondering if this would effect the intention of RF24Network & Rf24Mesh, but I haven't gotten through reading all the code for those.

Avamander commented 3 years ago

you mean on maniacBug's behalf? I spent the last 3 hours trying to make sense of this, so my brain is constantly questioning what I know or may not know.

When someone disables autoack and has to debug that it behaves differently.

2bndy5 commented 3 years ago

per non-plus datasheet, this behavior is consistent

2bndy5 commented 3 years ago

I recently added a commit titled "8 paragraphs that say keep auto-ack enabled" -- its a clarification in the setAutoAck() docs. This actually has nothing to do with auto-ack disabled. Sorry, my brain... I'll be dreaming in binary again.

Avamander commented 3 years ago

It probably should be written in the documentation of the function that disables autoack that it has side-effects. Was that where you added the paragraphs?

2bndy5 commented 3 years ago

here you go. yes for all pipes and individual pipes

2bndy5 commented 3 years ago

Looks like the Si24R1 datasheet says the same thing, but I'm trying to translate that from chinese (I don't read chinese -- my google translate does). Do you know of a copy for that datasheet that's in english?

BK2423 (AKA RFM73) datasheet doesn't even talk about multiceiving (probably because its trademarked by nordic) or how the ack packet gets its address. TMRh20 explicity called that clone "an obscure device", so I'm just not worried about this one.

2bndy5 commented 3 years ago

@Avamander don't forget that this issue only becomes a problem if the first byte is 0 for the address on pipe 0 -- so its mostly a library issue stemming from uint8_t friendly NULL vs 0.

TMRh20 commented 3 years ago

Long thread and again I’m having problems following the discussion, so hopefully this isn’t redundant info, but the TX address is assigned for reading on pipe0 whenever sending happens.

It needs to be closed so that the TX device does not end up listening on the same pipe as the recipient after sending is completed.

On Oct 22, 2020, at 6:59 AM, Brendan notifications@github.com wrote:

 @Avamander don't forget that this issue only becomes a problem if the first byte is 0 for the address on pipe 0 -- so its mostly a library issue stemming from uint8_t friendly NULL vs 0.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

2bndy5 commented 3 years ago

the TX address is assigned for reading on pipe0 whenever sending happens

Yes. Per the datasheet this is required, but only for when auto-ack is enabled. This isn't needed if the TX node doesn't care about ACK packets.

It needs to be closed so that the TX device does not end up listening on the same pipe as the recipient after sending is completed.

Ok, that explains the intention. However, this problem should be avoided with docs, not code. Technically the RF24Network nodes can listen to as many as 6 child nodes as long as the pipe 0 address is re-written to RX_ADDR_P0 register after exiting TX mode. I used the RF24Network as an example to demonstrate the benefit of using all pipes for RX mode.

The following is from my Multiceiver example (to address #658) in which the role variable is a number from 0-5 for TX nodes, or it is a negative number for the RX node (this code tested out fine):

void setRole() {
  if (role < 0) {
    // For the RX node

    // Set the addresses for all pipes to TX nodes
    for (uint8_t i = 0; i < 6; ++i)
      radio.openReadingPipe(i, address[i]);

    radio.startListening(); // put radio in RX mode

  } else {   // BOUNDS CHECKING IS DONE FROM `if (Serial.available())` BLOCK
    // For the TX node

    // set the payload's nodeID & reset the payload's identifying number
    payload.nodeID = role;
    payload.payloadID = 0;

    // Set the address on pipe 0 to the RX node.
    radio.stopListening(); // put radio in TX mode
    radio.openWritingPipe(address[role]);

    // According to the datasheet, the auto-retry features's delay value should
    // be "skewed" to allow the RX node to receive 1 transmission at a time.
    // So, use varying delay between retry attempts and 15 (at most) retry attempts
    radio.setRetries(((role * 3) % 12) + 3, 15); // maximum value is 15 for both args
  }
} // setRole

EDIT: the EN_RXADDR register's open/close status of the pipes (especially pipe 0) does not have any affect during TX mode. Again, pipe 0's usage is dominated by the ESB protocol when sending anything.

In other words, the process for TX mode (assuming auto-ack is enabled):

  1. device enters TX mode
  2. device prepares and transmits a packet based on the TX_ADDR register using pipe 0.
  3. device changes pipe 0's role from sending to listening (using address from RX_ADDR_P0 register) when device is configured to expect an ACK packet
  4. if no ACK is received (and ARC_CNT != SETUP_RETR register's ARC value), then repeat from step 1
  5. repeat this process for all payload's in TX FIFO until ARC_CNT reaches SETUP_RETR register's ARC value (in which case the MAX_RT flag is asserted) or TX FIFO is empty.

The process for RX mode (assuming auto-ack is enabled):

  1. device enters RX mode
  2. if a payload is received on any pipe that is open to do so, then prepare and transmit an ACK packet on pipe 0 using the address from RX_ADDR_Px register about the pipe that received the payload.
  3. save received payload to RX FIFO and return pipe 0 to RX mode (using the address from RX_ADDR_P0 register) to continue listening.
    • if pipe 0 is not opened with EN_RXADDR register, then device closes pipe 0

I feel I've exhausted the ways I can explain this on this thread.

TMRh20 commented 3 years ago

I can't understand the intention to close pipe 0

The original intention was that if the user was not using pipe0 as a reading pipe, the radio would not end up listening on pipe0 using the TX address. This will cause issues with auto-ack if the library does not close the TX RX pipe that is using the address of the previous recipient. This is why the issue should be addressed in code, not docs IMHO.

Its more of a user friendly capability than a necessary requirement I understand, but if not handled in code, the user would need a separate function to close it, or be forced to use pipe0 as a reading pipe, no?

2bndy5 commented 3 years ago

Yes, it would be more user friendly to close the pipe 0 when entering RX mode, especially if the user isn't monitoring what pipe received the payload. However, concerning the code snippet in the OP, this library only closes pipe 0 when the RX address written to pipe 0 has a 0 in the LSByte (malformed address). That means the library isn't actually closing pipe 0 after transitioning from TX mode when a well-formed TX address is used on pipe 0 (which counteracts the original intention you specify).

uint8_t address[2][6] = {"1Node", "2Node"};
radio.openWritingPipe(address[0]);
radio.openReadingPipe(1, address[1]);
radio.stopListening();  // pipe 0 and 1 are open for RX, but only pipe 0 is used in TX mode
radio.openReadingPipe(0, address[0]);
radio.startListening(); // pipe 0 and 1 are open for RX because the address on pipe 0 didn't start with "\0"

I fully appreciate this dialogue because its a great learning opportunity for me!

2bndy5 commented 3 years ago

I just noticed this line at the bottom of stopListening():

write_register(EN_RXADDR, read_register(EN_RXADDR) | _BV(pgm_read_byte(&child_pipe_enable[0]))); // Enable RX on pipe0

This is getting hard to track, I have to do some thought experiments.

TMRh20 commented 3 years ago

However, concerning the code snippet in the OP, this library only closes pipe 0 when the RX address written to pipe 0 has a 0 in the LSByte (malformed address). That means the library isn't actually closing pipe 0 after transitioning from TX mode when a well-formed TX address is used on pipe 0 (which counteracts the original intention you specify).

That is exactly the intention, except the lib should close pipe0 when the RX addy written to pipe0 is malformed OR is not present.

In short, close pipe0 if malformed or not in use when switching to RX mode.

2bndy5 commented 3 years ago

ok I'll leave the closeReadingPipe(0) call in, but I've already expanded the docs to notify the user of the underlying behavior. Is it ok if I leave the modified docs?

updated docs on startListening(), ~stopListening()~, openWritingPipe(), and openReadingPipe()

TMRh20 commented 3 years ago

Yup the docs look good

On Nov 15, 2020, at 8:05 PM, Brendan notifications@github.com wrote:

 ok I'll leave the closeReadingPipe(0) call in, but I've already expanded the docs to notify the user of the underlying behavior. Is it ok if I leave the modified docs?

updated docs on startListening(), stopListening(), openWritingPipe(), and openReadingPipe()

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

2bndy5 commented 3 years ago

if there was a private member that saved the state of the EN_AA register, we could limit this behavior to only when auto-ack is enabled for pipe 0 (similar to the config_reg private member). But that is an issue I'm putting off till after the next release (more likely just for my fork).

combs commented 3 years ago

Forgive me if I am missing something obvious. I'm wondering if this change to startListening() would be an improvement:

     ...
     config_reg |= _BV(PRIM_RX);
     write_register(NRF_CONFIG,config_reg);
     write_register(NRF_STATUS, _BV(RX_DR) | _BV(TX_DS) | _BV(MAX_RT));
     ce(HIGH);
     write_register(RX_ADDR_P0, pipe0_reading_address, addr_width);

     if(ack_payloads_enabled){
         flush_tx();
     }

In other words, always restoring pipe 0 reading address from pipe0_reading_address, instead of trying to sense whether it is set by LSB==0. (pipe0_reading_address is set to 0x0000000000 by the constructor.)

I think this would be a little more "noob friendly" as the possibility of receiving random messages at 0x0000000000 after TX w/AA seems less likely and less problematic to me than a noob trying to use an address with 0x00 LSB and having their pipes stealth-closed.

I've been wrestling with some strange problems with pipe 0 with (barf) Si24R1 modules, and this idea came out of me digging into that. This library is seriously cool and useful, and I hope I'm helping. Please let me know if I'm missing something obvious with this suggestion or if you have any other feedback.

2bndy5 commented 3 years ago

I think this would be a little more "noob friendly" as the possibility of receiving random messages at 0x0000000000 after TX w/AA seems less likely and less problematic to me than a noob trying to use an address with 0x00 LSB and having their pipes stealth-closed.

This actually relates to #496 . Personally, I'd prefer

if (memcmp(pipe0_reading_address, 0x0000000000, addr_width) != 0)
    // Restore pipe 0 address
else
    closeReadingPipe(0);

to better prevent undesired stealthy closure of pipe 0. I understand that this code wouldn't work as is because 0x0000000000 is not an rvalue, but I digress.

The only reason we use pipe0_reading_address is to restore the state of pipe 0 to it's original state when the user hasn't explicitly designated it for RX behavior. Granted, startListening() & stopListening() currently presume that auto-ack is always enabled.

so if they don't close the pipe explicitly, they could receive messages addressed to 0x0000000000

Setting aside the malformation of a zeroed-out address, I don't see how this is advantageous. Did you have a use case in mind for this?

A tiny bit less logic to run on MCUs. But, maybe one more SPI transaction if pipe 0 rx is unused.

I'm not worried about the quantity of logic (if that wasn't obvious from my memcmp() suggestion). I am a bit sensitive about unnecessary SPI transactions because they can slow things down.

combs commented 3 years ago

Ahh, I understand the existing logic more if SPI transactions are costly. I can see how this would reduce throughput to reduce the address each time. Thank you

2bndy5 commented 3 years ago

No problem. I do appreciate these discussions as it provides a great opportunity for learning (for all parties).