Closed nullstalgia closed 3 years ago
I'm cool with bumping versions if it doesn't break anything.
Couple of notes:
The ~ operator should do that automatically.
On a side note, I appear to be having a similar issue with the MQTT client. After a few hours, it loses connection and gets stuck in the "disconnecting connecting" loop (and that's with this updated lib, so maybe we don't need it).
After a set amount of failures like that, would it be wise to restart the ESP32? But then that might mess with a user... Is there a clean way to restart the MQTT Client without bothering the user in case it's a wrong password/ip/actual connection issue?
The MQTT client should be resetting itself automatically. If it's not, there's a problem. Anything interesting in serial logs or your MQTT broker's logs?
You can force the MQTT client to be recreated from scratch by saving settings, but that's pretty barbaric lol. The client should be robust to disconnects. (btw - I have not had disconnect issues since fixing the issue you helped find w/ the client)
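One way to picture the "after a set amount of failures" idea is a small escalation counter: first recreate the client, and only restart the chip as a last resort. This is just a sketch under assumptions. `MqttFailureTracker`, `RecoveryAction`, and both thresholds are hypothetical names made up for illustration, not part of epaper_templates or async-mqtt-client, and it does not try to tell a wrong password apart from a flaky network.

```cpp
#include <cstdint>

// Hypothetical escalation policy; names and thresholds are illustrative only.
enum class RecoveryAction { None, RecreateClient, RestartChip };

class MqttFailureTracker {
public:
  MqttFailureTracker(uint16_t recreateAfter, uint16_t restartAfter)
      : recreateAfter_(recreateAfter), restartAfter_(restartAfter) {}

  // Call when a connection attempt succeeds: the failure streak is over.
  void onConnected() { failures_ = 0; }

  // Call when a connection attempt fails; returns the suggested action.
  RecoveryAction onConnectFailed() {
    if (failures_ < UINT16_MAX) ++failures_;  // saturate instead of wrapping
    if (failures_ >= restartAfter_) return RecoveryAction::RestartChip;
    if (failures_ == recreateAfter_) return RecoveryAction::RecreateClient;
    return RecoveryAction::None;
  }

  uint16_t failures() const { return failures_; }

private:
  uint16_t recreateAfter_;
  uint16_t restartAfter_;
  uint16_t failures_ = 0;
};
```

With thresholds of, say, 3 and 5, the third consecutive failure would suggest recreating the client and the fifth would suggest a restart. A successful connection resets the streak.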
Well, actually, I forgot to mention.
Once that lock up happens, everything locks up. Web UI, Templates, API.
I do have this log from when it happened while I had the Serial monitor open:
Formatted value: /b/circle.bin
Rendering bit
_PowerOn : 35001
_Update_Part : 562001
_PowerOff : 20001
Formatted value: As of: 09:18PM
Rendering timestamp
_PowerOn : 35001
_Update_Part : 562001
_PowerOff : 20001
Formatted value: As of: 09:19PM
Rendering timestamp
_PowerOn : 35001
_Update_Part : 562001
_PowerOff : 20001
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
MqttClient - disconnected
MqttClient - connecting
Mosquitto
1586074757: New client connected from 192.168.86.45 as epaper-display-1168420400 (p2, c1, k60, u'tony').
1586074888: New connection from 192.168.86.47 on port 1883.
1586074888: New client connected from 192.168.86.47 as epaper-display-1823347020 (p2, c1, k60, u'tony').
1586074979: Client epaper-display-1823347020 has exceeded timeout, disconnecting.
1586075692: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586077493: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586078472: New connection from 192.168.86.47 on port 1883.
1586078472: New client connected from 192.168.86.47 as epaper-display-1823347020 (p2, c1, k60, u'tony').
1586078562: Client epaper-display-1823347020 has exceeded timeout, disconnecting.
1586079294: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586081095: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586082061: New connection from 192.168.86.47 on port 1883.
1586082061: New client connected from 192.168.86.47 as epaper-display-1823347020 (p2, c1, k60, u'tony').
1586082151: Client epaper-display-1823347020 has exceeded timeout, disconnecting.
1586082896: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586084697: Saving in-memory database to /mosquitto/data/mosquitto.db.
1586085070: Client epaper-display-1168420400 has exceeded timeout, disconnecting.
1168420400/192.168.86.45 is the one in question
1823347020/192.168.86.47 is one I have in deep sleep
It is powered on, and has a connection to the display, but stopped at 4:09am (timestamp of disconnect was 4:11am)
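For reading the Mosquitto lines above: as far as I know, the k60 in the connect message is the client's keepalive in seconds, and MQTT 3.1.1 lets the broker drop a client that has been silent for one and a half times that period. The arithmetic below is a rough sketch under that assumption. The helper names are made up for illustration, and the broker may log the timeout a second or so late.

```cpp
#include <cstdint>

// Grace period after which a broker may drop a silent client:
// one and a half times the keepalive (MQTT 3.1.1 Keep Alive rule).
constexpr uint32_t keepaliveDeadlineSeconds(uint32_t keepaliveSeconds) {
  return keepaliveSeconds + keepaliveSeconds / 2;
}

// Rough estimate of when the broker last heard from the client, assuming it
// logs the timeout right when the deadline expires (an assumption).
constexpr uint64_t estimatedLastPacket(uint64_t timeoutLoggedAt,
                                       uint32_t keepaliveSeconds) {
  return timeoutLoggedAt - keepaliveDeadlineSeconds(keepaliveSeconds);
}
```

The deep-sleep client connects at 1586078472 and is timed out at 1586078562, which is 90 seconds later and matches a 60 second keepalive. By the same reasoning, the timeout logged at 1586085070 for the stuck device suggests the broker last heard from it around 1586084980.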
Got it. According to the serial monitor, the ESP32 thinks the latest timestamp should be 9:19PM. Am I reading that right?
I don't know why MQTT would go into a spin loop, but seems like a problem if the screen isn't updating when it's supposed to. Maybe double-check that the connections are super solid?
Agh, sorry, I should have been clearer. If I opened the Serial monitor right now, it would restart the chip. That is an excerpt from a log that happened earlier in the day, but exact same symptoms.
If I restart the ESP, it works just fine.
But I think you missed me saying that even the Web UI and API lock up. I can't connect to it via the browser.
And looking at my router's settings, it's just dropped off the network entirely. Seems like this isn't an MQTT/API/WebServer issue.
Gotcha. How active are the serial logs when the system is hung like this? Is it spamming you constantly, is it a slow trickle, or is there just no output after a certain point?
It's kind of in between spam and a trickle.
Maybe every second or half second?
Okay, so seems like the issue is it's not successfully reconnecting to wifi after it's dropped.
I had a terrible time with this a few years ago (see this issue -- I was trying to help debug if you check the buried comments), but I thought I'd fixed it.
Maybe just need a more explicit check/WiFi reconnect retry in loop.
https://github.com/espressif/arduino-esp32/issues/3168
I also made an issue a while back but had no response.
What I ended up doing was this
...
loop()
...
  current_millis = millis();
  if (WiFi.status() == WL_CONNECTED) {
    // Remember the last time WiFi was seen connected.
    wifiConnected = current_millis;
  }
  // If disconnected for more than 5 mins, restart whole chip.
  if (current_millis - wifiConnected > (5UL * 60UL * 1000UL)) {
    ESP.restart();
  }
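The same check can be pulled into a small class so the timing logic can be tested off-device. This is only a sketch: `WifiWatchdog` is a hypothetical name, and it takes the connection state and a millis()-style timestamp as plain arguments instead of calling the Arduino APIs itself. It also shows why the unsigned subtraction keeps working when the 32-bit millisecond counter wraps (after roughly 49.7 days).

```cpp
#include <cstdint>

// Illustrative helper, not from the project. Timestamps are millis()-style
// 32-bit counters that wrap around after about 49.7 days.
class WifiWatchdog {
public:
  explicit WifiWatchdog(uint32_t timeoutMs, uint32_t nowMs = 0)
      : timeoutMs_(timeoutMs), lastConnectedMs_(nowMs) {}

  // Returns true when WiFi has been down for longer than the timeout,
  // i.e. when the caller should restart the chip.
  bool update(bool connected, uint32_t nowMs) {
    if (connected) {
      lastConnectedMs_ = nowMs;
      return false;
    }
    // Unsigned subtraction stays correct across counter wrap-around.
    return static_cast<uint32_t>(nowMs - lastConnectedMs_) > timeoutMs_;
  }

private:
  uint32_t timeoutMs_;
  uint32_t lastConnectedMs_;
};
```

In a sketch's loop() this would presumably be fed with `WiFi.status() == WL_CONNECTED` and `millis()`, calling `ESP.restart()` when `update` returns true.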
If memory serves, the issue is that automatic reconnection via WiFi.setAutoReconnect(true) just doesn't work the way it's supposed to. I thought I'd handled it in a more robust way somewhere, but I'm not seeing it.
Would be nice to avoid resetting if possible (although that's a nice fallback).
Are you able to try something like the suggestion in this comment?
Ok, I'm not crazy. I used to have more aggressive reconnect logic, but I removed it in v1.0.0. I think the platform upgrade (0.12 -> 0.18) fixed it on my network, so I took it out.
Should probably add it back. And agree with the guy in the comment -- most bulletproof approach is to add something in loop.
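If a reconnect check does go back into loop, one option is to keep it non-blocking so the web server and driver keep getting serviced while WiFi is down. The sketch below only decides when to start another attempt. `ReconnectScheduler` and the retry interval are hypothetical, and the actual reconnect call (for example `WiFi.begin()`) is left to the caller.

```cpp
#include <cstdint>

// Hypothetical helper: decides when loop() should start another reconnect
// attempt, without ever blocking in delay().
class ReconnectScheduler {
public:
  explicit ReconnectScheduler(uint32_t retryIntervalMs)
      : retryIntervalMs_(retryIntervalMs) {}

  // Returns true when the caller should start a reconnect attempt now.
  bool shouldAttempt(bool connected, uint32_t nowMs) {
    if (connected) {
      attempted_ = false;  // the next drop triggers an immediate attempt
      return false;
    }
    if (!attempted_ ||
        static_cast<uint32_t>(nowMs - lastAttemptMs_) >= retryIntervalMs_) {
      attempted_ = true;
      lastAttemptMs_ = nowMs;
      return true;
    }
    return false;
  }

private:
  uint32_t retryIntervalMs_;
  uint32_t lastAttemptMs_ = 0;
  bool attempted_ = false;
};
```

The first call after a drop returns true right away, and later calls return true at most once per interval, so the rest of loop() runs normally in between.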
After going back to the original dependencies list, this is my loop()
void loop() {
  if (shouldRestart) {
    ESP.restart();
  }
  if (timeClient.update() && lastSecond != second()) {
    lastSecond = second();
    driver->updateVariable("timestamp", String(timeClient.getEpochTime()));
  }
  if (webServer) {
    webServer->handleClient();
  }
  if (!suspendSleep && initialSleepMode == SleepMode::DEEP_SLEEP) {
    if (digitalRead(settings.power.sleep_override_pin) == settings.power.sleep_override_value) {
      Serial.println(F("Sleep override pin was held. Suspending deep sleep."));
      suspendSleep = true;
    } else if (millis() >= (settings.power.awake_duration * 1000)) {
      Serial.printf_P(
        PSTR("Wake duration expired. Going to sleep for %d seconds...\n"),
        settings.power.sleep_duration);
      Serial.flush();
      // Make sure the display is off while we sleep
      if (display) {
        display->hibernate();
      }
      // Convert to microseconds
      esp_sleep_enable_timer_wakeup(settings.power.sleep_duration * 1000000ULL);
      esp_deep_sleep_start();
    }
  }
  driver->loop();
  if ( WiFi.status() == WL_CONNECTED )
  {
    // WiFi is UP, do what ever
  } else
  {
    // wifi down, reconnect here
    WiFi.begin( );
    int WLcount = 0;
    int UpCount = 0;
    while (WiFi.status() != WL_CONNECTED && WLcount < 200 )
    {
      delay( 100 );
      Serial.printf(".");
      if (UpCount >= 60) // just keep terminal from scrolling sideways
      {
        UpCount = 0;
        Serial.printf("\n");
      }
      ++UpCount;
      ++WLcount;
    }
  }
} // END loop()
And the logs from last night:
Seems like it's not reconnecting then?
Might need to force-disconnect.
To me it looks like the MQTT broker keeps trying and doesn't let the WiFi-reconnect take a shot at it until 4 hours later.
So maybe having a proper disconnect in the broker would be smart. (They made adjustments to that in the repo. I'll send an issue now about updating it.)
Also, I found some forks of the original library that maybe we should take a peek at.
https://github.com/dx168b/async-mqtt-client/commit/2e85978ba0e801d7bbe33e510ce248f5aa812823
https://github.com/mcspr/async-mqtt-client/commit/c1fcfd1bc29cc0e908607fed110e9bc67de3e5b0
Formatted value: As of: 01:48PM
Rendering timestamp
_PowerOn : 35004
_Update_Part : 562001
_PowerOff : 20001
Formatted value: As of: 01:49PM
Rendering timestamp
_PowerOn : 35004
_Update_Part : 562001
_PowerOff : 20001
And at 1:49 it just went "poof" again. No disconnect or nothing. I think I sat down around that time at my desk, so maybe something got rustled? Unsure.
This is all super helpful, much appreciated.
Does everything come back online after WiFi reconnects?
The first fork in particular looks pretty interesting and worth trying.
It seems like it, however I didn't try to go into the UI until after it froze again at 1:49
I'll try that tomorrow after it inevitably does it again, lol.
So I've been having issues with another ESP32 I have on my work desk; it too is losing connection. So it appears that this is an issue with my network, rather than the project.
(Even having another WiFi router acting as an AP instead of using my normal Google Wifi Mesh connected didn't help. :/ )
Anyway, I think I forgot to mention this.
I agree with using semver via the ~ operator, but that only picks up updates to the Patch part of Maj.Min.Patch.
But this is up to you.
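To make the ~ behaviour concrete: a requirement like ~1.2.3 accepts any version with the same major and minor numbers and a patch number of at least 3. The check below is a minimal sketch of that rule. `Version` and `satisfiesTilde` are made-up names, and the version numbers are placeholders rather than real library releases.

```cpp
#include <cstdint>

// Illustrative only: a bare Maj.Min.Patch triple.
struct Version {
  uint32_t majorNum;
  uint32_t minorNum;
  uint32_t patchNum;
};

// True if v satisfies "~base": same major and minor, patch >= base patch.
constexpr bool satisfiesTilde(const Version& v, const Version& base) {
  return v.majorNum == base.majorNum && v.minorNum == base.minorNum &&
         v.patchNum >= base.patchNum;
}
```

So with ~1.2.3, a hypothetical 1.2.9 would be picked up automatically, while 1.3.0 would need the requirement to be bumped by hand.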
Bummer, that's too bad. Did the wifi reconnect in the loop help at all?
Should've been more clear re: patch versions, just meant to share that tidbit.
Not really? I'd have to test more, to be perfectly honest. But it didn't seem to make a difference on first glance.
Please help me, I am getting this error when starting with PlatformIO:
ImportError: No module named pathlib:
  File "/Users/brayamruiz/.platformio/penv/lib/python2.7/site-packages/platformio/builder/main.py", line 154:
    env.SConscript(item, exports="env")
  File "/Users/brayamruiz/.platformio/packages/tool-scons/script/../engine/SCons/Script/SConscript.py", line 541:
    return _SConscript(self.fs, *files, **subst_kw)
  File "/Users/brayamruiz/.platformio/packages/tool-scons/script/../engine/SCons/Script/SConscript.py", line 250:
    exec file in call_stack[-1].globals
  File "/Users/brayamruiz/Downloads/epaper_templates-master/scripts/platformio/build_web.py", line 8:
    from pathlib import Path
Just installed a build of epaper_templates with most deps updated to newest.
Async MQTT Client needed this, as the lib on PlatformIO is outdated by 2 years.
marvinroger/async-mqtt-client#7f1ba48
Full lib_deps_external
What spurred this is that I was having an MQTT issue similar to the one before, and was trying to see if this would solve it. Only time will tell, but maybe this is worth consideration as well?