Closed jddmartin closed 1 year ago
I think I am also seeing this, https://discord.com/channels/327254708534116352/537365702651150357/1035672894761619547
@calcut Do you have some comments on this apparently new problem?
Hmm, well this bit is new
If the mqtt broker keeps providing a response to _wait_for_msg, then it will keep looping for ~10 seconds (_recv_timeout) before giving up.
You could try adding a logger, or at least some print statements to verify if that is what is happening.
have you tried with timeout = 0
?
I tried with the default timeout of 0 as well as a few various values, though I'm not sure my issue is the same as the original one in this issue description
@calcut
Thanks for your response.
Yes, the duration of call is about 10 s (I should have been more precise in my original message).
1) If I set timeout = 0, the .loop() doesn't return at all
2) I have monitored tcp packets between mqtt server and the device. There are only a few packets sent every few iterations of the loop. Does that rule out mqtt broker keeps providing a response to _wait_for_ms?
Unfortunately my knowledge of the MQTT protocol is poor --- I tried to simplify my application code to give minimal working example, but I'm not 100% sure that it is adequate to diagnose problem (for example, I don't subscribe in client in this mwe, so maybe there shouldn't be any communication at all (?). I will try to understand how to add logging/print statements to diagnose, as you suggest.
interesting... I wouldn't call myself an expert either I've really just tested with the Adafruit IO broker, but perhaps other brokers behave differently. Maybe returning some sort of keep-alive or ping response, guessing here.
To debug... my strategy was essentially:
You could try printing all responses received, something like this:
while True:
rc = self._wait_for_msg(timeout)
if rc is None:
break
print(f"received: {rc}")
if time.monotonic() - stamp > self._recv_timeout:
if self.logger is not None:
self.logger.debug(
f"Loop timed out, message queue not empty after {self._recv_timeout}s"
)
print(f"timed out after {self._recv_timeout}s")
break
rcs.append(rc)
I was using a locally-running mosquitto broker, if that helps
This is definitely a problem for me too. I can't seem to get any combination of minimqtt running on a Matrix Portal M4.
The loop call hangs for several seconds. Even with Asyncio, this blocks other coroutines and prevents updates to the RGB matrix.
This is pretty bad for a device that claims to be IoT friendly and doesn't even have a useable MQTT client.
I'll take another look today. What broker are you connecting to @jinglemansweep ? As I mentioned, I haven't had a problem using with AdafruitIO, so maybe other brokers behave differently.
I'm actually considering taking a look myself after work commitments today. I'm no expert on MQTT protocol though.
I'm using Mosquitto mainly because I want to integrate with Home Assistant. Having a self hosted broker is a must for my uses unfortunately.
Happy to provide any help if needed.
I've just tested with mosquitto running on a RPi, and also on a Mac. I didn't do any configuration of mosquitto except to turn off the security with
listener 1883 0.0.0.0
allow_anonymous true
I started with the simpletest: https://raw.githubusercontent.com/adafruit/Adafruit_CircuitPython_MiniMQTT/main/examples/minimqtt_simpletest.py
which appears to work fine (assuming the broker ip and port are specified in secrets.py)
I also added this, but didn't observe any delays in the loop() function
while True:
mqtt_client.loop()
print("loop done")
time.sleep(0.5)
That was with ESP32-S2 and Circuitpython 7.3.1 I have a metro M4 somewhere, I'll try with that too.
" I have a metro M4 somewhere, I'll try with that too." Or not! that board doesn't have networking haha
interesting... I wouldn't call myself an expert either I've really just tested with the Adafruit IO broker, but perhaps other brokers behave differently. Maybe returning some sort of keep-alive or ping response, guessing here.
To debug... my strategy was essentially:
* remove the adafruit_minimqtt.mpy file from your CIRCUITPY/lib/adafruit_minimqtt folder (or rename it with a different extension) * replace it with a .py version, e.g. https://raw.githubusercontent.com/adafruit/Adafruit_CircuitPython_MiniMQTT/aec777753afc4460832dc943666de856da41e873/adafruit_minimqtt/adafruit_minimqtt.py * add print statements or other modifications
You could try printing all responses received, something like this:
while True: rc = self._wait_for_msg(timeout) if rc is None: break print(f"received: {rc}") if time.monotonic() - stamp > self._recv_timeout: if self.logger is not None: self.logger.debug( f"Loop timed out, message queue not empty after {self._recv_timeout}s" ) print(f"timed out after {self._recv_timeout}s") break rcs.append(rc)
@jinglemansweep Maybe you could try this... the broker may be returning something that causes this loop to continue
Is everyone who is having a problem using non-native networking I wonder?
This function executes quite differently for "ESP32SPI", which will be what the M4 Matrix Portal uses.
I guess it doesn't return None
, but I don't have a suitable board to check what it does return.
I will give this a go later this afternoon. I actually have another Matrix Portal M4 arriving today which I'll use for testing.
great. poking a little deeper, it looks like
_sock_exact_recv should raise OSError(errno.ETIMEDOUT) https://github.com/adafruit/Adafruit_CircuitPython_MiniMQTT/blob/aec777753afc4460832dc943666de856da41e873/adafruit_minimqtt/adafruit_minimqtt.py#L989
which should be caught by
and cause the loop() to exit.
If that isn't happening, see if you an figure out what is going on.
@calcut thank you for suggestion on diagnostics and sorry for the delay in responding.
When I put in those suggested print statements (8th message in this thread), I get:
received: 0
received: 0
received: 0
...
received: 0
received: 0
timed out after 10s
Thanks.
interesting... I wouldn't call myself an expert either I've really just tested with the Adafruit IO broker, but perhaps other brokers behave differently. Maybe returning some sort of keep-alive or ping response, guessing here.
To debug... my strategy was essentially:
- remove the adafruit_minimqtt.mpy file from your CIRCUITPY/lib/adafruit_minimqtt folder (or rename it with a different extension)
- replace it with a .py version, e.g. https://raw.githubusercontent.com/adafruit/Adafruit_CircuitPython_MiniMQTT/aec777753afc4460832dc943666de856da41e873/adafruit_minimqtt/adafruit_minimqtt.py
- add print statements or other modifications
You could try printing all responses received, something like this:
while True: rc = self._wait_for_msg(timeout) if rc is None: break print(f"received: {rc}") ...
So, I'm getting 0
instead of None
. That's with no messages queued or being received, in its idle waiting state:
received: 0
received: 0
received: 0
<snip>
Changing the above line to:
if rc is None or rc == 0:
break
... fixes the issue. My RGB matrix updates with a decent framerate, with a mqtt_client.loop(timeout=0.001)
call.
I can confirm that I can publish MQTT messages and my application receives them and logs them without blocking the RGB matrix updates:
MANAGER::RUN
MANAGER::LOOP
Subscribed to test/topic with QOS level 1
New message on topic test/topic: HELLLLO
New message on topic test/topic: HELLLLO
I'm using my project for testing: https://github.com/jinglemansweep/matrix-portal-asyncio
I'm happy to submit a PR, but I only have Matrix Portal M4's to test with, so my change could break other boards.
Actually, looking at _wait_for_msg
function:
# Block while we parse the rest of the response
self._sock.settimeout(timeout)
print(res)
if res in [None, b""]:
# If we get here, it means that there is nothing to be received
return None
Adding the print
statement returns the following:
bytearray(b'\x00')
bytearray(b'\x00')
bytearray(b'\x00')
bytearray(b'\x00')
So, the socket is receiving constant single zero bytes. Perhaps a better fix would be adding 0
to the "nothing received" blacklist?
if res in [None, b"", 0]:
# If we get here, it means that there is nothing to be received
return None
Okay, that doesn't work for some reason. If _wait_for_msg
returns None
at that stage, it appears to cause too many polls, locking up the RGB matrix and eventually causing a MemoryError
:
File "/lib/adafruit_minimqtt/adafruit_minimqtt.py", line 874, in loop
MemoryError: memory allocation failed, allocating 32768 bytes
Okay, my mistake. That line should probably be:
if res in [None, b"", bytearray(b'\x00')]:
# If we get here, it means that there is nothing to be received
return None
With just that change, everything appears to be working as expected.
The most recent suggestion of @jinglemansweep fixes the issue for me (RP2040 feather, with ethernet feather wing, talking to mosquitto broker on LAN).
nice work @jinglemansweep, you up for doing a PR to fix it?
Sure. I have a branch ready but didn't have permission to push to it
Sure. I have a branch ready but didn't have permission to push to it
This is on your end, you mean? A PR is fine.
Yes, I usually create a branch, push my changes to my branch and submit a PR to get it merged into master/main
I get permission denied when pushing my branch:
ERROR: Permission to adafruit/Adafruit_CircuitPython_MiniMQTT.git denied to jinglemansweep.
fatal: Could not read from remote repository.
Yes, I usually create a branch, push my changes to my branch and submit a PR to get it merged into master/main
I get permission denied when pushing my branch:
ERROR: Permission to adafruit/Adafruit_CircuitPython_MiniMQTT.git denied to jinglemansweep. fatal: Could not read from remote repository.
Did you fork the repo? If you want to make a branch, do it on your fork; you can't make a branch on the adafruit
repo and push it (you could make it only locally, but better to fork). We have a guide here: https://learn.adafruit.com/contribute-to-arduino-with-git-and-github
Sorry, just realised you use the fork model:
My PR from my fork is here: https://github.com/adafruit/Adafruit_CircuitPython_MiniMQTT/pull/127
Fixed by #127, but could use further diagnosis. Redid issue in #128.
There is a recent commit 48aba3c63fc3384f426bd08ededff7d6c2d4fa98 which causes a lengthy delay in calls to .loop() in one of my test programs, which did not occur prior to this commit.EDIT: Apologies, I made a mistake in identifying a specific commit. The commit tagged 5.5.2 shows the lengthy delay, and commits tagged 5.5.1 and 5.5.0 do not show the problem.
I am using an ethernet featherwing attached to a rp2040 feather. Running the test code below (a modified version of example programs for minimqtt) before and after commit shows a lengthy (several second) delay introduced in calls to .loop() (the third last line in code). Whereas prior to this commit, loop returns quickly; i.e., on order of specified timeout.
I'm running circuitpython 7.3.3, but I get identical slow behaviour with 8.0.0 and most recent version of minimqtt.