nRF24 / RF24Gateway

TCP/IP (RF24Ethernet) and RF24Network Gateway
http://nRF24.github.io/RF24Gateway
GNU General Public License v2.0
48 stars 37 forks source link

Attempting MQTT connection...failed, rc=-2 try again in 5 seconds #13

Closed SimonMerrett closed 3 years ago

SimonMerrett commented 3 years ago

Hi, I realise I'm clutching at straws here but I'm starting to run out of options. Before the recent releases in July and August I set up a link from an Arduino Nano to mqtt2zigbee running on a Pi 4 with Raspbian desktop. I recently tried to port this over to an Arduino Nano and a Pi 3 A+ on Raspbian light. I used the most recent mqtt2zigbee, mosquitto and couldn't get things working with the ncurses, gatewayNode or ncursesInt to start with. After updating the pubsubclient and RF24 etc libraries for the Arduino, I can get intermittent connections but I get long periods of Attempting MQTT connection...failed, rc=-2 try again in 5 seconds.

I have tried going back to the original setup with the Pi 4 and the Nano but the Nano is now working from the latest RF24 etc version so it isn't exactly the same (same intermittent connection dropping and failure to reconnect). I have interrupt errors on the ncursesInt example for every attempted packet and this is using the same hardware connections that were working when testing my last issue (ie no changes). I did go through the example and explicitly made sure interrupts were turned back on with gw.interrupts(1) rather than gw.interrupts() - like I said, running out of things to check/tweak.

I notice that the installer now gives options for Pi SPI drivers which weren't there before. I selected SPIdev for compatibility on the A+ but I don't know if that's what would have been installed by default on the pre-Jul update version on the Pi 4.

I acknowledge that this could be a problem with mosquitto or pubsubclient but given the raft of recent changes to RF24, I thought it would at least be worth asking if there are any thoughts on this. Even confirming that the process followed in the guide (https://tmrh20.blogspot.com/2019/05/automationiot-with-nrf24l01-and-mqtt.html) still results in reliable performance would be really helpful. I set up mosquitto local service following https://randomnerdtutorials.com/how-to-install-mosquitto-broker-on-raspberry-pi/ .

Not sure what else to include/mention at this stage because it is such a tricky issue to localise. Please shout for more detail that might be helpful.

TMRh20 commented 3 years ago

Thanks for the detailed description, it makes identifying potential problems a lot easier.

First I guess is that if you are getting interrupt errors on every attempted packet, interrupts are not working. You should be able to confirm this by just using teh standard ncurses example instead of ncursesInt. The most likely causes of interrupts not working is of course a: the pin not being connected b: the interrupt pin not specified correctly in the example c: wiringPi not installed (the RF24 lib uses wiringPi for interrupt handling) wiringPi is not installed by default on RPI OS light

In the mqtt client examples, you will notice the line char clientID[] = {"arduinoClient "}; which specifies a unique identifier for the client. If two devices are using the same ID they will be constantly disconnected from MQTT.

The BCM2835 driver was the previous default for RF24, so you may want to try with that as well. WIll need to recompile all libraries and restart SPI/reboot after changing driver.

If you try the following:

  1. Stop your RF24Gateway master
  2. Delete the dhcplist.txt file from the RF24Gateway example directory you are using as master.
  3. Restart your Arduino node(s) then the gateway
  4. Do they show up in the address list relatively quickly?
  5. Can you ping them from the gateway?

If they don't show up in the list, then I would suggest troubleshooting basic functionality with RF24 core examples. If they show up in the list, but can't connect to MQTT, can you ping them?

If not able to ping: I would run an RF24Mesh example on the Arduino and see if the Gateway (no interrupt version) shows user packets being received. If it gets RF24Mesh user packets, then RF24 and RF24Mesh are working fine, it might be a MQTT or Gateway issue. If no user packets, again troubleshoot/test with RF24 core examples.

hope this helps, let me know,

TMRh20

On Wed, Aug 26, 2020 at 11:00 AM SimonMerrett notifications@github.com wrote:

Hi, I realise I'm clutching at straws here but I'm starting to run out of options. Before the recent releases in July and August I set up a link from an Arduino Nano to mqtt2zigbee running on a Pi 4 with Raspbian desktop. I recently tried to port this over to an Arduino Nano and a Pi 3 A+ on Raspbian light. I used the most recent mqtt2zigbee, mosquitto and couldn't get things working with the ncurses, gatewayNode or ncursesInt to start with. After updating the pubsubclient and RF24 etc libraries for the Arduino, I can get intermittent connections but I get long periods of Attempting MQTT connection...failed, rc=-2 try again in 5 seconds.

I have tried going back to the original setup with the Pi 4 and the Nano but the Nano is now working from the latest RF24 etc version so it isn't exactly the same (same intermittent connection dropping and failure to reconnect). I have interrupt errors on the ncursesInt example for every attempted packet and this is using the same hardware connections that were working when testing my last issue (ie no changes). I did go through the example and explicitly made sure interrupts were turned back on with gw.interrupts(1) rather than gw.interrupts() - like I said, running out of things to check/tweak.

I notice that the installer now gives options for Pi SPI drivers which weren't there before. I selected SPIdev for compatibility on the A+ but I don't know if that's what would have been installed by default on the pre-Jul update version on the Pi 4.

I acknowledge that this could be a problem with mosquitto or pubsubclient but given the raft of recent changes to RF24, I thought it would at least be worth asking if there are any thoughts on this. Even confirming that the process followed in the guide ( https://tmrh20.blogspot.com/2019/05/automationiot-with-nrf24l01-and-mqtt.html) still results in reliable performance would be really helpful. I set up mosquitto local service following https://randomnerdtutorials.com/how-to-install-mosquitto-broker-on-raspberry-pi/ .

Not sure what else to include/mention at this stage because it is such a tricky issue to localise. Please shout for more detail that might be helpful.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/nRF24/RF24Gateway/issues/13, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT5KHEN45BAMMCW6JWNIBTSCU5T3ANCNFSM4QMBQVIQ .

SimonMerrett commented 3 years ago

Brilliant pointers - thanks. I will probably start with a fresh install of RaspPi OS with desktop on the A+ as it will make setup and reinstalling everything faster. I'm definitely going to try the original SPI drivers too. Then I can run through the checklist. I have been deleting the dhcplist.txt files every now and then. Sometimes the node is quickly detected, other times not. Oh yes, I have also turned the Pi examples and Arduino code to radio.setPALevel(RF24_PA_MIN) as the antennas are only about a metre apart for testing.

SimonMerrett commented 3 years ago
pi@raspberrypi:~ $ ping 10.10.3.5
PING 10.10.3.5 (10.10.3.5) 56(84) bytes of data.
64 bytes from 10.10.3.5: icmp_seq=1 ttl=64 time=33.3 ms
64 bytes from 10.10.3.5: icmp_seq=2 ttl=64 time=30.1 ms
64 bytes from 10.10.3.5: icmp_seq=3 ttl=64 time=44.4 ms
^C
--- 10.10.3.5 ping statistics ---
36 packets transmitted, 3 received, 91.6667% packet loss, time 332ms
rtt min/avg/max/mdev = 30.075/35.936/44.395/6.131 ms
pi@raspberrypi:~ $ ping 10.10.3.5
PING 10.10.3.5 (10.10.3.5) 56(84) bytes of data.
64 bytes from 10.10.3.5: icmp_seq=9 ttl=64 time=1022 ms
64 bytes from 10.10.3.5: icmp_seq=42 ttl=64 time=29.7 ms
64 bytes from 10.10.3.5: icmp_seq=44 ttl=64 time=28.9 ms
64 bytes from 10.10.3.5: icmp_seq=46 ttl=64 time=23.3 ms
64 bytes from 10.10.3.5: icmp_seq=52 ttl=64 time=24.1 ms
64 bytes from 10.10.3.5: icmp_seq=54 ttl=64 time=14.0 ms
64 bytes from 10.10.3.5: icmp_seq=56 ttl=64 time=20.1 ms
64 bytes from 10.10.3.5: icmp_seq=57 ttl=64 time=36.2 ms
64 bytes from 10.10.3.5: icmp_seq=58 ttl=64 time=34.7 ms
64 bytes from 10.10.3.5: icmp_seq=60 ttl=64 time=24.4 ms
64 bytes from 10.10.3.5: icmp_seq=61 ttl=64 time=25.5 ms
64 bytes from 10.10.3.5: icmp_seq=62 ttl=64 time=22.2 ms
64 bytes from 10.10.3.5: icmp_seq=63 ttl=64 time=20.5 ms
64 bytes from 10.10.3.5: icmp_seq=64 ttl=64 time=22.1 ms
64 bytes from 10.10.3.5: icmp_seq=66 ttl=64 time=29.3 ms
64 bytes from 10.10.3.5: icmp_seq=68 ttl=64 time=29.7 ms
64 bytes from 10.10.3.5: icmp_seq=71 ttl=64 time=15.8 ms
^C
--- 10.10.3.5 ping statistics ---
71 packets transmitted, 17 received, 76.0563% packet loss, time 644ms
rtt min/avg/max/mdev = 14.027/83.679/1022.149/234.688 ms

for an example of what happens to ping as the node stops being connected. BTW, the decent ping times in the second run are still while the Arduino Zero is failing mqtt connection. While I have been typing this message it has reconnected! Just so sporadic.

TMRh20 commented 3 years ago

Hmm, I'm running 15 nodes right now, 9 mqtt clients publishing every 1 to 3 seconds, 2 clients running LEDs, 1 http client, 1 server, 2 RPi pinging the master, so not sure why it would be so spotty with 1 node.

Have you tried a different radio channel? Usually 1 and 50 are pretty open in my area. Master, channel 50: uint8_t nodeID=0; gw.begin(nodeID,50);

Nodes, channel: mesh.begin(50);

On Wed, Aug 26, 2020 at 11:35 AM SimonMerrett notifications@github.com wrote:

pi@raspberrypi:~ $ ping 10.10.3.5 PING 10.10.3.5 (10.10.3.5) 56(84) bytes of data. 64 bytes from 10.10.3.5: icmp_seq=1 ttl=64 time=33.3 ms 64 bytes from 10.10.3.5: icmp_seq=2 ttl=64 time=30.1 ms 64 bytes from 10.10.3.5: icmp_seq=3 ttl=64 time=44.4 ms ^C --- 10.10.3.5 ping statistics --- 36 packets transmitted, 3 received, 91.6667% packet loss, time 332ms rtt min/avg/max/mdev = 30.075/35.936/44.395/6.131 ms pi@raspberrypi:~ $ ping 10.10.3.5 PING 10.10.3.5 (10.10.3.5) 56(84) bytes of data. 64 bytes from 10.10.3.5: icmp_seq=9 ttl=64 time=1022 ms 64 bytes from 10.10.3.5: icmp_seq=42 ttl=64 time=29.7 ms 64 bytes from 10.10.3.5: icmp_seq=44 ttl=64 time=28.9 ms 64 bytes from 10.10.3.5: icmp_seq=46 ttl=64 time=23.3 ms 64 bytes from 10.10.3.5: icmp_seq=52 ttl=64 time=24.1 ms 64 bytes from 10.10.3.5: icmp_seq=54 ttl=64 time=14.0 ms 64 bytes from 10.10.3.5: icmp_seq=56 ttl=64 time=20.1 ms 64 bytes from 10.10.3.5: icmp_seq=57 ttl=64 time=36.2 ms 64 bytes from 10.10.3.5: icmp_seq=58 ttl=64 time=34.7 ms 64 bytes from 10.10.3.5: icmp_seq=60 ttl=64 time=24.4 ms 64 bytes from 10.10.3.5: icmp_seq=61 ttl=64 time=25.5 ms 64 bytes from 10.10.3.5: icmp_seq=62 ttl=64 time=22.2 ms 64 bytes from 10.10.3.5: icmp_seq=63 ttl=64 time=20.5 ms 64 bytes from 10.10.3.5: icmp_seq=64 ttl=64 time=22.1 ms 64 bytes from 10.10.3.5: icmp_seq=66 ttl=64 time=29.3 ms 64 bytes from 10.10.3.5: icmp_seq=68 ttl=64 time=29.7 ms 64 bytes from 10.10.3.5: icmp_seq=71 ttl=64 time=15.8 ms ^C --- 10.10.3.5 ping statistics --- 71 packets transmitted, 17 received, 76.0563% packet loss, time 644ms rtt min/avg/max/mdev = 14.027/83.679/1022.149/234.688 ms

for an example of what happens to ping as the node stops being connected. BTW, the decent ping times in the second run are still while the Arduino Zero is failing mqtt connection. While I have been typing this message it has reconnected! Just so sporadic.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nRF24/RF24Gateway/issues/13#issuecomment-681022305, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT5KHEAGB47DACSXGKGXODSCVBU5ANCNFSM4QMBQVIQ .

SimonMerrett commented 3 years ago

Just a quick update. I changed to channel 50 without doing any RF survey and moved the Arduino Zero node away from the desktop with laptop, raspi, mobile phone etc. There seemed to be a significant improvement in "uptime" but I don't have the serial monitor available away from the desktop and am only able to monitor with RF24Gateway_ncurses packet flow. There was a point where the gateway wasn't receiving anything and a power cycle on the Zero didn't help - it took ctrl+c and restarting RF24Gateway_ncurses but it did get going again. I will try and do better testing soon. I'm considering going to RF24_250KBPS to see if that helps.

TMRh20 commented 3 years ago

Couple things a: Make sure to update all calls to mesh.begin(50); including lower down in the gateway master example else it may restart on the wrong channel b: regarding teh failure handling part where the mesh is restarted, I believe there is just a problem with the milis() functionality on RPi where it is not consistent and/or jumps around in value causing false detections of errors c: the aforementioned issue appears to be more prevalent at 250Kbps d: I am/will be looking at ways to remedy the situation.

On Thu, Aug 27, 2020 at 3:26 PM SimonMerrett notifications@github.com wrote:

Just a quick update. I changed to channel 50 without doing any RF survey and moved the Arduino Zero node away from the desktop with laptop, raspi, mobile phone etc. There seemed to be a significant improvement in "uptime" but I don't have the serial monitor available away from the desktop and am only able to monitor with RF24Gateway_ncurses packet flow. There was a point where the gateway wasn't receiving anything and a power cycle on the Zero didn't help - it took ctrl+c and restarting RF24Gateway_ncurses but it did get going again. I will try and do better testing soon. I'm considering going to RF24_250KBPS to see if that helps.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nRF24/RF24Gateway/issues/13#issuecomment-682199417, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT5KHCLZ4QH5AMZU3TPPXLSC3FPLANCNFSM4QMBQVIQ .

SimonMerrett commented 3 years ago

Brilliant tips again - I will check if I'm restarting on the correct channel. In the meantime I found this ref a deeper call into clocks on raspberry pi which may be of use: https://yosh.ke.mu/article/raspberry_pi_getting_time

SimonMerrett commented 3 years ago

so I changed the failure detection section in ncurses to mesh.begin(channel) where uint8_t channel = 50. I will leave it running now to see if that is more consistent. However, there's nothing in failLog.txt so I don't know if that function was called. Will delete so that we know if it is created in the next run.

Edit I rechecked 'failLog.txt' and there was a 1 in there. We could be on to something here.

and I also wondered about the commented out delay here https://github.com/nRF24/RF24Mesh/blob/fce52350416070d3fae40a866119010fad631cc8/RF24Mesh.cpp#L17 as I'm using SPIDEV.

SimonMerrett commented 3 years ago

So I left it again and after about two hours the connection dropped again. I left it 40 mins and there was no reconnection. failLog.tx contained 1. But I'm not sure if the problem is at the Pi end. As a test, I rebooted the Arduino and the packets start flowing and the mqtt messages are successfully being received by the mosquitto server.

I have create a gist with my node code here, in case you can see something silly I have done in there.

After deleting the failLog.txt and waiting for the next irrecoverable drop in connection, I plan to move the Pi away from the desk so that I can bring the Arduino back and hook it up to the serial monitor. Is it worth turning on any of the #defines like DEBUG?

TMRh20 commented 3 years ago

https://gist.github.com/SimonMerrett/b09132b51bb1e9642e240995cf500ba2#file-rf24_mqtt_bulb_string-ino-L117 

Not sure how this works without causing problems, because radio.begin() is not called until mesh.begin(); Either way, you would need to put that line after mesh.begin() or call radio.begin() before setPALevel();

It is difficult to identify issues just looking at code, so I would suggest running the default mqtt example on another device if possible, with the only changes made being ones required to make it operational, pin #s, ip address 

RF24Network_config.h:

    //#define SERIAL_DEBUG
    //#define SERIAL_DEBUG_MINIMAL
    //#define SERIAL_DEBUG_ROUTING

Arduino Sketch, AVR devices need printf enabled for debug output: #include "printf.h" After serial.begin printf_begin();

You said you are using a Nano?

In the debug output, there should be a bunch of these during renwal:

NET Sending id 180 from 04444 to 0100 type 194 //Poll request
MAC Sending to 00 via 00 on pipe 0
NET Pipe 0 on node 00 has address ccccccccc3
MAC Received on 0 id 180 from 00 to 04444 type 194  //Poll response
NET message ffef
System payload rcvd 194
Sending id 180 from 04444 to 00 type 195   //Address request
MAC Sending to 00 via 00 on pipe 0
NET Pipe 0 on node 00 has address ccccccccc3
MAC Received on 0 id 180 from 00 to 04444 type 128  //Address response
NET message 0004
MAC: System payload rcvd 128
setup_address node=04 mask=07 parent=00 pipe=04

...

Attempting MQTT connection... NET Sending id 182 from 04 to 00 type 148  //Fragmented payload
NET Sending id 182 from 04 to 00 type 150  //Fragmented payload

That is mainly what you should see, along with some type 198 messages (lookups) and partial printouts of the IP data. You should be able to see what is happening, if the device is getting poll responses etc, if writes are failing, not getting address responses, interference from another device, etc.

SimonMerrett commented 3 years ago

Oof, really struggling here. I was using an arduino Zero but I can't see a way to enable printf() for SAMD series* so I switched across to a Nano and I can't even get as far as I was with the Zero! The debug output is in the gist but it isn't representative of the issues I was observing with the Zero.

*Tried the libprintf to no avail

TMRh20 commented 3 years ago

How about adding at top of sketch?

define printf Serial.printf

On Aug 28, 2020, at 10:52 AM, SimonMerrett notifications@github.com wrote:

 Oof, really struggling here. I was using an arduino Zero but I can't see a way to enable printf() for SAMD series* so I switched across to a Nano and I can't even get as far as I was with the Zero! The debug output is in the gist but it isn't representative of the issues I was observing with the Zero.

Tried the libprintf to no avail — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

SimonMerrett commented 3 years ago

image

Well I was just about to try that when this happened! I can't even copy/paste it because Win10 clipboard is refusing to capture it properly. So do I need to change the buffer size in #define MAX_PAYLOAD_SIZE 144 in RF24NetworkConfig.h? Interesting that libprintf managed to break through to the serial monitor during the crash!

TMRh20 commented 3 years ago

No that suggests problems with your code. (memory issues: buffer overruns etc)

Again, test with a known working example subscribed to the same topic...

As a developer, I really don't have time to provide technical support, debugging other peoples code etc, it just doesn't make sense/not possible. Users need to test with known working code if finding issues.

On Fri, Aug 28, 2020 at 12:10 PM SimonMerrett notifications@github.com wrote:

[image: image] https://user-images.githubusercontent.com/26767525/91600906-a30ea680-e960-11ea-903d-b137596f357a.png

Well I was just about to try that when this happened! I can't even copy/paste it because Win10 clipboard is refusing to capture it properly. So do I need to change the buffer size in #define MAX_PAYLOAD_SIZE 144 in RF24NetworkConfig.h?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nRF24/RF24Gateway/issues/13#issuecomment-683019550, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT5KHCGNWPGF77FRQBSDWLSC7XIRANCNFSM4QMBQVIQ .

SimonMerrett commented 3 years ago

No that suggests problems with your code. (memory issues: buffer overruns etc) Again, test with a known working example subscribed to the same topic... As a developer, I really don't have time to provide technical support, debugging other peoples code etc, it just doesn't make sense/not possible. Users need to test with known working code if finding issues.

Sorry - as I said at the beginning I was clutching at straws and I'm going to use the example and build back up. I agree it's not your job to debug my code - sorry and thanks for everything you've done.

TMRh20 commented 3 years ago

No need to apologise, just explaining my point of view. Development and ongoing maintenance takes a huge hit due to all the requests for programming assistance etc and many have nothing to do with RF24, so it's just a matter of limiting things. Like I said, if you can identify issues using the examples etc, please open another issue.

On Fri, Aug 28, 2020 at 12:51 PM SimonMerrett notifications@github.com wrote:

Closed #13 https://github.com/nRF24/RF24Gateway/issues/13.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nRF24/RF24Gateway/issues/13#event-3705251963, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAT5KHBBRWVDBHGUIMTUQBTSC74CVANCNFSM4QMBQVIQ .

SimonMerrett commented 3 years ago

I just noticed the removal of delay() in reconnect(), as fixed in https://github.com/nRF24/RF24Ethernet/commit/09ff6d32e9b3bb7913f326b6a08a2df50c3d8aaf Because I had to adapt the example sketch for mqtt_basic.ino to the SAMD21 my "plain" copy (made before updating the libraries) didn't receive the benefit of the updated example code. That led to my more customised code inheriting the old version of reconnect and I have just changed it. Based on the stability I have seen over the last couple of hours, this could have been the main cause of the issue I experienced. Without throwing too many rocks, a quick look at the buffer and memory management-related issues on the pubsub repo makes me think that's where I should focus effort now.

TMRh20 commented 3 years ago

Excellent. Updates are a dual edged sword sometimes, fixing one issue and causing another. I think it is a memory issue of some sort. I did notice the pubsub client will fragment large payloads depending on the configured size of the client buffer, so modifying MAX_PAYLOAD_SIZE in RF24Network_config.h may actually make a difference if there are issues there, as well as minimizing the length of the messages of course.

SimonMerrett commented 3 years ago

More tests over the last couple of days suggest that I was also suffering from a poor rf link as a more favourable position for rf propagation and reception has also improved stability (still testing with RF24_PA_MIN - moved after mesh.begin()). I don't have a choice to reduce the length of the long messages without digging into zigbee2mqtt but the good news is that for the foreseeable future I don't need to subscribe to any topics, only publish commands. There is also a long message example for the pubsubclient library that I may look into if I need to handle long messages. Due to the rf link being so important, I was thinking of implementing a power ramping scheme in reconnect(). I can't think of a good way to mirror this on the gateway though (to ensure acks get through). If anything comes to mind, please mention it.

TMRh20 commented 3 years ago

I'm not sure about what you mean by mirroring that on the gateway. How would the gateway know when to adjust the power level? I keep my nodes at the max pa level.

One of my nodes dropped into a state where it could renew its address and verify connectivity, but not connect to mqtt. A quick reset brought it back up, so there could be some issue either in RF24Ethernet or mqtt. I may try a few things with the default examples i'm using as time allows, maybe a different mqtt client too just to be sure.

SimonMerrett commented 3 years ago

The gateway absolutely wouldn't know the power level but there may be a scheme which allows you to ensure that at some point all combinations of power are tried at both ends. The wrong but easiest analogy is that a 12hr clock that is slow will eventually be in sync with an accurate clock because they're cycling at different speeds. I really don't know if it's possible but if it were, I imagine it would employ a similar scheme. If the gateway were expecting a known node to always be connected, it could ramp through power levels at rate A for a window of time since the known node was last connected. Meanwhile the known node could ramp through power levels at rate B as it tried to reconnect. Anyway I'm dragging this off topic.

TMRh20 commented 3 years ago

Interesting, trying not to get too far off topic, that would probably require some internal changes to RF24Gateway unless it was just done randomly or based off RF24Mesh/Network level messages. I think I previously tried averaging the RPD values and adjusting power levels, but it never worked out well. In RF24Mesh it is used for debugging: https://github.com/nRF24/RF24Mesh/blob/master/RF24Mesh.cpp#L292

TMRh20 commented 3 years ago

Still trying to find time to fully investigate this issue. I've been testing with the https://github.com/256dpi/arduino-mqtt library and just filed an issue to get it working with IP address, so you need to download the library from github, not the library manager until a release is made to have it work out of the box.

Its working nicely, so should help rule out/in issues between RF24Ethernet and pubsub library.

SimonMerrett commented 3 years ago

Thanks. I tried implementing an oled display and buttons on my samd21-based device but the various calls to mqtt etc were taking so long it killed the responsiveness to the user. So I resorted to a samd21 attached to the Pi with an ethernet spi adapter and a 5cm ethernet cable between them, and basic rf24 messages from the gui samd21 and the rpi samd21. The project was terribly overrun so that was the easiest avenue avaible. But I do want to see where the delays may have originated. BTW, for integration with e.g. zigbee2mqtt where the messages can get quite long, being able to whack the pubsub payload buffer size up to 2048 bytes on the samd21 without fear of piling in on memory is very nice.

TMRh20 commented 3 years ago

I've been using the alternate MQTT library for the past while on all of my nodes and have not seen any downtime on them. This of course indicates a problem between RF24Ethernet and the pubsub lib, although I'm not certain if it is the combination of them or just the mqtt library. I think I'm going to close this issue again though, as there is a working alternative that doesn't seem to be affected.

SimonMerrett commented 3 years ago

Thanks for testing it out with known-working radios. I will take a look at the alternate library.