shmuelzon / esp32-ble2mqtt

A BLE to MQTT bridge running on an ESP32
MIT License
662 stars 106 forks source link

No reconnect on wifi disconnect #68

Open bssstudio opened 4 years ago

bssstudio commented 4 years ago

I am also having trouble with the app not reconnecting to the wifi AP if disconnected. I have inspected the code and it should do it in the wifi.c. However the debug log looks like the event handler in wifi.c is not considered (at least not logged).

D (42859) BLE: Received GAP event 3 (ESP_GAP_BLE_SCAN_RESULT_EVT)
D (44099) BLE: Received GAP event 3 (ESP_GAP_BLE_SCAN_RESULT_EVT)
I (44479) wifi: state: run -> init (2c0)
I (44479) wifi: pm stop, total sleep time: 0 us / 20159085 us

I (44479) wifi: n:7 0, o:7 1, ap:255 255, sta:7 1, prof:1
D (44489) event: SYSTEM_EVENT_STA_DISCONNECTED, ssid:XXXX, ssid_len:6, bssid:76:4d:28:xx:xx:xx, reason:2
V (44499) event: enter default callback
shmuelzon commented 4 years ago

Hey,

I can't tell exactly what the problem is here, but the event handler in wifi.c is definitely registered because the application waits until WiFi is connected before starting MQTT and BLE (which seem active in your logs). The SYSTEM_EVENT_STA_DISCONNECTED event is handled there and initiates the connection once again.

Is the snippet above the complete log or was there anything printed after it?

Thanks

bssstudio commented 4 years ago

I will test again, but I think this is the complete log. I was also expecting the described behavior as it is defined in wifi.c. However, if it reached the event handler, it would have been logged with tag "WiFI" and not "event". I can recreate the scenario by kicking the client (ESP32) from the wifi AP.

jamesfcurran commented 4 years ago

Hi Folks, Any additional info on this issue? I am also having an issue where the devices eventually stop talking to the MQTT server and fall off the WiFi network. Am about to dig in on it but wanted to see if there was anything new from bssstudio. Thanks! Jim

bssstudio commented 4 years ago

Hi! I am also still experiencing non-reconnects. However, due to governmental restrictions in place I am currently unavailable to get physical access to this installation. I might try to dig through the drawers if I find another ESP32 to reproduce this issue at home. I remember that in logs there was SYSTEM_EVENT_STA_DISCONNECTED that got fired on disconnect from WiFi and it somehow was not handled by the code. I mean it looks like it should have been, but there is just no other log entries after that and the node stays disconnected. Hope it helps. Best, Blaz

shmuelzon commented 4 years ago

@jamesfcurran, could you please compile this project with debug log level and post the logs when it disconnects from WiFi? I have a few devices running at home but I’ve never had this kind of problem.

Also, just to make sure, which ESP-IDF version are you using?

Thanks!

jamesfcurran commented 4 years ago

Thanks for the responses. I also am prevented from getting to my devices but I do have one or two esp32s around here. I am using ESP-IDF v 3.2.

Out of 9 devices deployed and reporting last night, only 5 continue to report to the MQTT server. Maybe flakey wifi? Am using the latest version of the Mosquitto broker.

Will try to bring up a few devices here at home in the next couple of days. Should I change the log level in sdkconfig?

Thanks, Jim

shmuelzon commented 4 years ago

When you’re able, could you please test with v3.2.2? That’s what I test with and perhaps there were some relevant fixes.

Yes, log level is changed via the configuration menu (make menuconfig).

Thanks for helping troubleshoot this!

mibas74 commented 4 years ago

Hi, i've seen similar, maybe this same behaviour - it seems when it looses connection it hangs, and there is no log about it, because there is no connection. Is it possible to add something like a wifi watchdog please? (to restart device if the wifi connection will be lost for lets say 5 minutes ?)

Best Regards Michal

shmuelzon commented 4 years ago

Hey Michal,

Yeah, remote logging won’t help here. I’ve also seen this issue a few times when I had WiFi problems at home. I think the problem is that the underlying WiFi task notices the disconnection and sends an event that this application should receive and act upon (reconnect). However, the event never reaches the application.

On one ESP32 device this happens intermittently but never happened on another, I need to see if they are the same HW version. Once, when it happened, I also saw some timeout message in the serial logs that relates to events that couldn’t be delivered. Maybe the application thread is stuck somewhere? I tried to add more memory to the application stack in case something got overwritten but that didn’t help. I also tried to update the project to ESP-IDF v4.1 (beta) since they added a new event mechanism but after a few days something similar happened (I didn’t have it connected to serial that time).

Anyways, I can’d add the watchdog since the app doesn’t know WiFi was disconnected. I have a few more things to try but not a lot to go on. If anyone has any serial logs or ideas, please let me know.

I’ll update once I find something concrete.

Thanks!

jamesfcurran commented 4 years ago

Just an update (no solution). I am watching one of my BLE2MQTT devices with an 8266 running a serial to wifi bridge (https://github.com/jeelabs/esp-link).

The only thing out of the ordinary in about 3 days of monitoring is a brownout warning that I think caused the device to lock up. Now, I have the 8266 hanging off of the 3.3v supply on the ESP32 device so that could easily be the cause but I have never had a lot of confidence in my USB power supplies (or my line power, for that matter).

Will continue to watch.
Jim

shmuelzon commented 4 years ago

I think brownouts should reset the ESP32, at least that's the default in the configuration menu. That being said, that's an interesting observation since the only device I've seen this behavior on does have a problematic DC to DC converter and I had to disable brownout detection to get it to actually run most of the time.

Perhaps using both WiFi and BLE takes it toll. I disable WiFi power saving and set BLE to the highest transmission power.

jamesfcurran commented 4 years ago

Thanks for the reply. Here is a sanitized copy of the logging from what appears to be a wifi event this afternoon. Netadmins assure me the wireless was fine. The device recovered and continues to work. But the logs are quite noisy maybe worth a look? Note I filter out the copious discovery and SCAN_RESULT messages during collection.

Also, do you think I should try with IDF v 3.2.2 ? I am going to look into the power settings and also eliminate the 8266 from the power feed.

wifi_bridge.20200611-14.txt

Thanks, Jim

shmuelzon commented 4 years ago

Which IDF are you using? v3.2.2 is the one I’m using and it proved to be the most stable. I have a version for v4.0 and v4.1 but they seem to fail more often.

jamesfcurran commented 4 years ago

Am using v3.2 now

shmuelzon commented 4 years ago

OK, there were some bug fixes between the two versions. I guess it can’t hurt to upgrade but I don’t think it will solve this particular issue.

mibas74 commented 4 years ago

Hi, using esp-link (Thanks for advise!) I've found I think slightly different bahaviour - it hangs on the BT side, but MQTT seems to work, that means there is still WiFi working, but BT not. BLE devices are Xiaomi temperature and humidity sensors.

Device 1:

_

[0;32mI (200122) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/226c0000-6476-4566-7562-66734470666d/226caa55-6476-4566-7562-66734470666d = 84,61,50,52,46,53,32,72,61,54,48,46,54,0
I (200132) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/226c0000-6476-4566-7562-66734470666d/226caa55-6476-4566-7562-66734470666d = 84,61,50,52,46,53,32,72,61,54,48,46,56,0
I (200222) BLE2MQTT: Discovered Mijia Temp+Hum broadcaster
I (200232) BLE2MQTT: Discovered BLE device: 58:2d:34:36:1f:d6 (RSSI: -80), connecting
I (200242) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/DeviceInformation/ModelNumberString = Duck_Release
I (200272) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/DeviceInformation/SerialNumberString = X-46���������
I (200302) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/DeviceInformation/HardwareRevisionString = 2.00
I (200602) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/DeviceInformation/FirmwareRevisionString = 00.00.66
I (200832) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/00001530-1212-efde-1523-785feabcd123/00001534-1212-efde-1523-785feabcd123 = 1,0
I (200972) BLE2MQTT: Publishing: 58:2d:34:36:1a:98/0000fe95-0000-1000-8000-00805f9b34fb/00000002-0000-1000-8000-00805f9b34fb = 170,1
W (201272) BT_APPL: bta_gattc_conn_cback() - cif=1 connected=0 conn_id=513 reason=0x0013
E (201282) BLE: Failed reading characteristic, status = 0x85
I (201282) BLE: Connection closed, reason = 0x13
I (201282) BLE2MQTT: Disconnected from device: 58:2d:34:36:1a:98
I (210362) BLE2MQTT: Connected to device: 58:2d:34:36:1f:d6, scanning
E (210372) BT_BTC: btc_gattc_cback transfer failed

Failed sending remote log: 12
E (210422) OUTBOX: /home/michal/esp32/esp-idf/components/mqtt/esp-mqtt/lib/mqtt_outbox.c:37 (outbox_enqueue): Memory exhausted
Failed sending remote log: 12
E (210432) MQTT_CLIENT: Error write data or timeout, written len = -1
Failed sending remote log: 12
E (210432) BT_HCI: host_recv_pkt_cb couldn't aquire memory for inbound data buffer.

Failed sending remote log: 12
E (210442) MQTT_CLIENT: Error to subscribe topic=58:2d:34:36:1f:d6/0000fe95-0000-1000-8000-00805f9b34fb/00000014-0000-1000-8000-00805f9b34fb/Set, qos=0
Failed sending remote log: 12
E (210462) MQTT: Failed subscribing to 58:2d:34:36:1f:d6/0000fe95-0000-1000-8000-00805f9b34fb/00000014-0000-1000-8000-00805f9b34fb/Set
Failed sending remote log: 12
I (216282) wifi: bcn_timout,ap_probe_send_start
Failed sending remote log: 12
W (216282) wifi: alloc eb len=76 type=2 fail, heap:968

Failed sending remote log: 12
W (216292) wifi: m f probe req l=0

Failed sending remote log: 12
W (216792) wifi: alloc eb len=76 type=2 fail, heap:968

Failed sending remote log: 12
W (216792) wifi: m f probe req l=0

Failed sending remote log: 12
W (217292) wifi: alloc eb len=76 type=2 fail, heap:968

Failed sending remote log: 12
W (217302) wifi: m f probe req l=0

Failed sending remote log: 12
W (217802) wifi: alloc eb len=76 type=2 fail, heap:968

Failed sending remote log: 12
W (217802) wifi: m f probe req l=0

Failed sending remote log: 12
W (218302) wifi: alloc eb len=76 type=2 fail, heap:968

Failed sending remote log: 12
W (218302) wifi: m f probe req l=0

c eb len=28 type=3 fail, heap:968

Failed sending remote log: 12
W (218822) wifi: m f disassoc

Failed sending remote log: 12
I (218822) wifi: pm stop, total sleep time: 0 us / 216776377 us

Failed sending remote log: 5
I (218832) wifi: n:1 0, o:1 1, ap:255 255, sta:1 1, prof:1
Failed sending remote log: 5
E (218842) MQTT_CLIENT: Read error or end of stream
Failed sending remote log: 5
I (218842) WiFi: Disconnected
Failed sending remote log: 5
I (218852) MQTT: MQTT client disconnected
I (218872) Log: Disabled remote logging
I (218872) BLE2MQTT: Disconnected from WiFi, stopping MQTT
I (218872) MQTT: Disconnecting MQTT client
I (218922) BLE2MQTT: Disconnected from MQTT, stopping BLE
W (218922) BT_APPL: bta_gattc_conn_cback() - cif=1 connected=0 conn_id=1 reason=0x0016
I (218932) BLE: Connection closed, reason = 0x0
I (218932) BLE2MQTT: Disconnected from device: 58:2d:34:36:1f:d6
I (220912) WiFi: Disconnected
I (220912) BLE2MQTT: Disconnected from WiFi, stopping MQTT
I (220912) MQTT: Disconnecting MQTT client
I (221032) wifi: n:1 1, o:1 0, ap:255 255, sta:1 1, prof:1
I (221032) wifi: state: init -> auth (b0)
I (221042) wifi: state: auth -> assoc (0)
I (221042) wifi: state: assoc -> run (10)
I (221072) wifi: connected with @Home2, channel 1
I (221072) wifi: pm start, type: 0

I (221072) WiFi: Connected
I (221852) event: sta ip: 192.168.0.236, mask: 255.255.0.0, gw: 192.168.0.1
I (221852) Log: Enabling remote logging
I (221852) BLE2MQTT: Connected to WiFi, connecting to MQTT
I (221852) MQTT: Connecting MQTT client
I (221862) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE
I (221882) MQTT_CLIENT: Sending MQTT CONNECT message, type: 1, id: 0000
I (221892) MQTT: MQTT client connected
I (221892) MQTT: Publishing from queue: 58:2d:34:36:1f:d6/Connected = false
I (221902) BLE2MQTT: Connected to MQTT, scanning for BLE devices
E (229912) BT_HCI: command_timed_out hci layer timeout waiting for response to a command. opcode: 0x406
E (908892) BT_HCI: command_timed_out hci layer timeout waiting for response to a command. opcode: 0x406

------------
restart of router/WiFi AP
------------

TT: Disconnected from WiFi, stopping MQTT
I (3198042) MQTT: Disconnecting MQTT client
I (3198162) wifi: n:1 1, o:1 0, ap:255 255, sta:1 1, prof:1
I (3198162) wifi: state: init -> auth (b0)
I (3198162) wifi: state: auth -> assoc (0)
I (3198172) wifi: state: assoc -> run (10)
I (3198202) wifi: connected with @Home2, channel 1
I (3198202) wifi: pm start, type: 0

I (3198202) WiFi: Connected
I (3219852) event: sta ip: 192.168.0.236, mask: 255.255.0.0, gw: 192.168.0.1
I (3219852) Log: Enabling remote logging
I (3219852) BLE2MQTT: Connected to WiFi, connecting to MQTT
I (3219852) MQTT: Connecting MQTT client
I (3219862) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE
I (3219882) MQTT_CLIENT: Sending MQTT CONNECT message, type: 1, id: 0000
I (3219892) MQTT: MQTT client connected
I (3219892) BLE2MQTT: Connected to MQTT, scanning for BLE devices
_

and it still hangs here..

mibas74 commented 4 years ago

and after some hours another device:

_I (1219938) wifi: pm start, type: 0

I (1219948) WiFi: Connected
I (1220848) event: sta ip: 192.168.0.113, mask: 255.255.0.0, gw: 192.168.0.1
I (1220848) Log: Enabling remote logging
I (1220848) BLE2MQTT: Connected to WiFi, connecting to MQTT
I (1220858) MQTT: Connecting MQTT client
I (1220858) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE
I (1220878) MQTT_CLIENT: Sending MQTT CONNECT message, type: 1, id: 0000
I (1220888) MQTT: MQTT client connected
I (1220888) MQTT: Publishing from queue: 58:2d:34:36:18:9c/Connected = false
I (1220898) BLE2MQTT: Connected to MQTT, scanning for BLE devices
E (1228908) BT_HCI: command_timed_out hci layer timeout waiting for response to a command. opcode: 0x406
E (1808888) BT_HCI: command_timed_out hci layer timeout waiting for response to a command. opcode: 0x406_

seems to be esp-idf issue: https://github.com/espressif/esp-idf/issues/3944

shmuelzon commented 4 years ago

To be honest, I'm not sure what to make of these... You present BLE issues but this thread originally discussed WiFi not reconnecting. In the cases I've seen this issue it was truly WiFi that was disconnected as I didn't get any ping responses. There may be other BLE issues as well but I don't know if it somehow relates here. Most people use the ESP32 for WiFi only so maybe there are still coexistence issues. Maybe we're having some memory issues. I tried increasing the application's stack but that had no affect. Maybe we're pushing the MQTT's stack limit and we're overwriting some memory? I would have expected to see more complaints about this issue if it were cause by WiFi alone.

jamesfcurran commented 4 years ago

Update - well -- devices are still hanging unexpectedly/unpredictably. I added code to reboot every 4 hours or upon disconnect from WiFi but the devices still get confused. I have a great deal of logging that I can share @shmuelzon if you are interested. I have been looking at the many WiFi issues reported against esp-idf so going to 4.1 seems interesting but the new esp32-ble2mqtt Ethernet feature seems to be ad odds with the latest stable IDF.

Any other thoughts ?

shmuelzon commented 4 years ago

Do the logs show any errors or warnings or do they just stop at some point? If you're willing, I just push a new branch for supporting ESP-IDF v4.1rc. It's not really tested but at least it compiles. Yes, Ethernet support seems to have changed a lot since v3.2. I removed many things that I don't know if they're not supported anymore, supported internally without the users needing to set anything or the configuration just changed and it wasn't obvious to me where it moved. Either way, you're free to try it and I'd appreciate any feedback I can get.

Thanks!

jamesfcurran commented 4 years ago

Happy to try out the new release. Thank you.

ESP log is attached. I am filtering out most business-as-usual messages:

>   next if ($line=~/ESP_GAP_BLE_SCAN_RESULT_EVT/);
>   next if ($line=~/discovered/i);
>   next if ($line=~/Queuing event HEARTBEAT_TIMER/);

Typically the esp32 stays up (responding to ping) but no longer reports BLE events and is not connected to the MQTT server. One might blame the WiFi and I can only say that other devices have no problem staying connected including the 8266 that is capturing logs from the ESP32 via the UART.

Jim

esp-link.txt

jamesfcurran commented 4 years ago

Quick update - Updating to ESP-IDF 4.1rc1 and the ESP-IDF 4.x branch of esp32_ble2mqtt

Updated 9 devices at work: 2 of them locked up within 24 hours At work, the 9 esp32s are tracking about 40 iBeacons, each one transmitting once every 7 seconds.

At home, the esp32 is only tracking my keychain and a iBeacon client on my cell phone Updated 1 device at home: Has been up over 4 days

I did not add my reboot logic but I left in a single-line change in ble2mqtt.c that prevents a connect to any device. I am only interested in RSSI from iBeacons. From ble2mqtt.c: static void ble_on_device_discovered(mac_addr_t mac, int rssi) { uint8_t connect = config_ble_should_connect(mactoa(mac)); connect = 0; / jfc /

Seems like more iBeacons leads to more disconnects....?

Am wondering if the time spent scanning for BLE devices could be modified to reduce potential conflict over the wireless modem? Also, this message keeps appearing about every 15 minutes:

2020-08-14 21:01:33 E (198003267) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 21:16:33 E (198903267) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 21:31:32 E (199803277) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 21:46:32 E (200703287) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 22:01:32 E (201603297) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 22:16:32 E (202503307) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 22:31:32 E (203403307) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-14 22:46:32 E (204303317) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc

More as it comes. Is there anything more info that would help? Jim

shmuelzon commented 4 years ago

Unfortunately, I’ve recently blown up my PC so there isn’t much I can do at the moment. I’ve also updated a single device at home, connected to two peripherals and no beacons. Like your findings, it also seems to hang daily but I don’t have access to the logs to compare them with yours. We can try again if/when the official version is out. One thing that might help is removing the configuration options in sdkconfig.default that allow more than 4 connections. Since you don’t connect to many (any?) peripherals, there is no reason to take up more memory. Maybe that configuration causes issues...

BTW - if you don’t want it to connect to any devices, you can always set an empty whitelist without modifying the code

jamesfcurran commented 4 years ago

Sorry about your computer!

I'll switch to the empty whitelist - would rather not touch the code....but: If the IDE can't get this dual-function capability right maybe it would be better for the app to listen on BT for a configurable number of seconds and then connect to the wireless and dump what it heard to the MQTT server and disconnect from the wireless and repeat. Not sure if access points would object to that behavior.

I'll alter the settings in sdkconfig as you recommend in any case and see what happens. Actually there are quite a few BLE/BT settings to understand.

Finally, although it does not contain a lockup, here is a log file with a few days of operation under the new software. It does show some network events but the system appears to recover gracefully and it continues to report valid data as of this morning.

esp-link.txt

Jim

shmuelzon commented 4 years ago

Thanks, I'll try to understand what the btu_hcif_hdl_command_complete opcode 0x2005 status 0xc messages are.

I think that WiFi coexistence is a must have for this kind of device and I remember they put a lot of effort into it. It used to be done in hardware and I think the default now is in software as they wanted to tweak it. Perhaps it's worth a try to enable it in hardware again (if that option still exists). Managing coexistence in the application might be OK for the beacon-only use-case but not when connecting to peripherals. First, you don't want to keep connecting and disconnecting from them, I have an (unsubstantiated) feeling it drains the battery quicker and you might miss notifications. The same with regards to disconnecting from WiFi, you might miss MQTT messages that are meant for you and not necessarily retained.

Thanks again for putting the effort into testing and debugging this, I do appreciate it!

jamesfcurran commented 4 years ago

Well here's an update. First, I agree with your points above regarding loss of real-time connectivity. It is just frustrating that this is such a challenge for the system.

Two postings got my attention as relevant:

https://www.gitmemory.com/issue/espressif/esp-idf/5539/654136725 https://www.esp32.com/viewtopic.php?f=13&t=6707&start=30

Again, only caring about catching broadcasts, I changed the scan_interval to 0x80 and the scan_window to 0x10 in ble.c. While I see the ESP32s disconnect frequently, they do come back on line on their own pretty much every time.

My (unsubstantiated) theory is that the devices have trouble dealing intermittent WiFi and competition for the radio compounds the issue. Below is a log from a device I am monitoring. The interesting stuff is backeted by the ever-present BT_HCI:... messages appearing every 15 minutes. Notice the crowded timestamps at 11:14 from the various event handlers? The device recovered on its own after 2 minutes of thrashing.

Anyway - hope this helps. Jim

2020-08-28 09:58:40 E (576904373) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 10:13:40 E (577804383) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 10:28:40 E (578704383) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 10:43:40 E (579604393) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 10:58:40 E (580504403) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 11:13:40 E (581404403) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 11:14:12 I (581435703) wifi:bcn_timout,ap_probe_send_start 2020-08-28 11:14:14 I (581438203) wifi:ap_probe_send over, resett wifi status to disassoc 2020-08-28 11:14:14 I (581438203) wifi:state: run -> init (c800) 2020-08-28 11:14:14 I (581438203) wifi:pm stop, total sleep time: 4959970131 us / 5387496020 us 2020-08-28 11:14:14 2020-08-28 11:14:14 I (581438213) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1 2020-08-28 11:14:14 I (581438223) WiFi: Disconnected 2020-08-28 11:14:14 I (581438223) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:14 E (581438223) TRANS_TCP: tcp_poll_read select error 113, errno = Software caused connection abort, fd = 54 2020-08-28 11:14:14 I (581438233) MQTT: Disconnecting MQTT client 2020-08-28 11:14:14 E (581438243) MQTT_CLIENT: Poll read error: 0, aborting connection 2020-08-28 11:14:14 E (581438253) TRANS_TCP: tcp_poll_write select error 0, errno = Success, fd = 54 2020-08-28 11:14:14 E (581438253) MQTT_CLIENT: Error write data or timeout, written len = -1, errno=0 2020-08-28 11:14:14 E (581438263) MQTT_CLIENT: Error sending disconnect message 2020-08-28 11:14:14 I (581438273) MQTT: MQTT client disconnected 2020-08-28 11:14:14 I (581438283) BLE2MQTT: Disconnected from MQTT, stopping BLE 2020-08-28 11:14:16 I (581439913) WiFi: Disconnected 2020-08-28 11:14:16 I (581439913) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:16 I (581439913) MQTT: Disconnecting MQTT client 2020-08-28 11:14:17 I (581441493) WiFi: Disconnected 2020-08-28 11:14:17 I (581441493) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:17 I (581441493) MQTT: Disconnecting MQTT client 2020-08-28 11:14:19 I (581443063) WiFi: Disconnected 2020-08-28 11:14:19 I (581443063) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:19 I (581443063) MQTT: Disconnecting MQTT client 2020-08-28 11:14:20 I (581444633) WiFi: Disconnected 2020-08-28 11:14:20 I (581444633) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:20 I (581444643) MQTT: Disconnecting MQTT client 2020-08-28 11:14:22 I (581446213) WiFi: Disconnected 2020-08-28 11:14:22 I (581446213) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:22 I (581446213) MQTT: Disconnecting MQTT client 2020-08-28 11:14:24 I (581447783) WiFi: Disconnected 2020-08-28 11:14:24 I (581447783) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:24 I (581447783) MQTT: Disconnecting MQTT client 2020-08-28 11:14:25 I (581449363) WiFi: Disconnected 2020-08-28 11:14:25 I (581449363) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:25 I (581449363) MQTT: Disconnecting MQTT client 2020-08-28 11:14:27 I (581450933) WiFi: Disconnected 2020-08-28 11:14:27 I (581450933) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:27 I (581450933) MQTT: Disconnecting MQTT client 2020-08-28 11:14:28 I (581452513) WiFi: Disconnected 2020-08-28 11:14:28 I (581452513) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:28 I (581452513) MQTT: Disconnecting MQTT client 2020-08-28 11:14:30 I (581454083) WiFi: Disconnected 2020-08-28 11:14:30 I (581454083) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:30 I (581454083) MQTT: Disconnecting MQTT client 2020-08-28 11:14:31 I (581455653) WiFi: Disconnected 2020-08-28 11:14:32 I (581455653) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:32 I (581455653) MQTT: Disconnecting MQTT client 2020-08-28 11:14:33 I (581457233) WiFi: Disconnected 2020-08-28 11:14:33 I (581457233) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:33 I (581457233) MQTT: Disconnecting MQTT client 2020-08-28 11:14:35 I (581458803) WiFi: Disconnected 2020-08-28 11:14:35 I (581458803) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:35 I (581458803) MQTT: Disconnecting MQTT client 2020-08-28 11:14:36 I (581460383) WiFi: Disconnected 2020-08-28 11:14:36 I (581460383) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:36 I (581460383) MQTT: Disconnecting MQTT client 2020-08-28 11:14:38 I (581461953) WiFi: Disconnected 2020-08-28 11:14:38 I (581461953) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:38 I (581461953) MQTT: Disconnecting MQTT client 2020-08-28 11:14:39 I (581463523) WiFi: Disconnected 2020-08-28 11:14:39 I (581463523) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:39 I (581463533) MQTT: Disconnecting MQTT client 2020-08-28 11:14:41 I (581465103) WiFi: Disconnected 2020-08-28 11:14:41 I (581465103) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:41 I (581465103) MQTT: Disconnecting MQTT client 2020-08-28 11:14:43 I (581466673) WiFi: Disconnected 2020-08-28 11:14:43 I (581466673) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:43 I (581466673) MQTT: Disconnecting MQTT client 2020-08-28 11:14:44 I (581468253) WiFi: Disconnected 2020-08-28 11:14:44 I (581468253) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:44 I (581468253) MQTT: Disconnecting MQTT client 2020-08-28 11:14:46 I (581469823) WiFi: Disconnected 2020-08-28 11:14:46 I (581469823) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:46 I (581469823) MQTT: Disconnecting MQTT client 2020-08-28 11:14:47 I (581471393) WiFi: Disconnected 2020-08-28 11:14:47 I (581471403) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:47 I (581471403) MQTT: Disconnecting MQTT client 2020-08-28 11:14:49 I (581472973) WiFi: Disconnected 2020-08-28 11:14:49 I (581472973) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:49 I (581472973) MQTT: Disconnecting MQTT client 2020-08-28 11:14:50 I (581474543) WiFi: Disconnected 2020-08-28 11:14:50 I (581474543) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:50 I (581474543) MQTT: Disconnecting MQTT client 2020-08-28 11:14:52 I (581476123) WiFi: Disconnected 2020-08-28 11:14:52 I (581476123) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:52 I (581476123) MQTT: Disconnecting MQTT client 2020-08-28 11:14:54 I (581477693) WiFi: Disconnected 2020-08-28 11:14:54 I (581477693) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:54 I (581477693) MQTT: Disconnecting MQTT client 2020-08-28 11:14:55 I (581479263) WiFi: Disconnected 2020-08-28 11:14:55 I (581479273) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:55 I (581479273) MQTT: Disconnecting MQTT client 2020-08-28 11:14:57 I (581480843) WiFi: Disconnected 2020-08-28 11:14:57 I (581480843) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:57 I (581480843) MQTT: Disconnecting MQTT client 2020-08-28 11:14:58 I (581482413) WiFi: Disconnected 2020-08-28 11:14:58 I (581482413) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:14:58 I (581482413) MQTT: Disconnecting MQTT client 2020-08-28 11:15:00 I (581483993) WiFi: Disconnected 2020-08-28 11:15:00 I (581483993) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:00 I (581483993) MQTT: Disconnecting MQTT client 2020-08-28 11:15:01 I (581485563) WiFi: Disconnected 2020-08-28 11:15:01 I (581485563) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:01 I (581485563) MQTT: Disconnecting MQTT client 2020-08-28 11:15:03 I (581487143) WiFi: Disconnected 2020-08-28 11:15:03 I (581487143) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:03 I (581487143) MQTT: Disconnecting MQTT client 2020-08-28 11:15:05 I (581488713) WiFi: Disconnected 2020-08-28 11:15:05 I (581488713) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:05 I (581488713) MQTT: Disconnecting MQTT client 2020-08-28 11:15:06 I (581490283) WiFi: Disconnected 2020-08-28 11:15:06 I (581490283) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:06 I (581490283) MQTT: Disconnecting MQTT client 2020-08-28 11:15:08 I (581491863) WiFi: Disconnected 2020-08-28 11:15:08 I (581491863) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:08 I (581491863) MQTT: Disconnecting MQTT client 2020-08-28 11:15:09 I (581493433) WiFi: Disconnected 2020-08-28 11:15:09 I (581493433) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:09 I (581493433) MQTT: Disconnecting MQTT client 2020-08-28 11:15:11 I (581495013) WiFi: Disconnected 2020-08-28 11:15:11 I (581495013) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:11 I (581495013) MQTT: Disconnecting MQTT client 2020-08-28 11:15:12 I (581496583) WiFi: Disconnected 2020-08-28 11:15:12 I (581496583) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:12 I (581496583) MQTT: Disconnecting MQTT client 2020-08-28 11:15:14 I (581498153) WiFi: Disconnected 2020-08-28 11:15:14 I (581498153) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:14 I (581498153) MQTT: Disconnecting MQTT client 2020-08-28 11:15:16 I (581499733) WiFi: Disconnected 2020-08-28 11:15:16 I (581499733) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:16 I (581499733) MQTT: Disconnecting MQTT client 2020-08-28 11:15:17 I (581501303) WiFi: Disconnected 2020-08-28 11:15:17 I (581501303) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:17 I (581501303) MQTT: Disconnecting MQTT client 2020-08-28 11:15:19 I (581502883) WiFi: Disconnected 2020-08-28 11:15:19 I (581502883) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:19 I (581502883) MQTT: Disconnecting MQTT client 2020-08-28 11:15:20 I (581504453) WiFi: Disconnected 2020-08-28 11:15:20 I (581504453) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:20 I (581504453) MQTT: Disconnecting MQTT client 2020-08-28 11:15:22 I (581506023) WiFi: Disconnected 2020-08-28 11:15:22 I (581506033) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:22 I (581506033) MQTT: Disconnecting MQTT client 2020-08-28 11:15:23 I (581507603) WiFi: Disconnected 2020-08-28 11:15:23 I (581507603) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:23 I (581507603) MQTT: Disconnecting MQTT client 2020-08-28 11:15:25 I (581509173) WiFi: Disconnected 2020-08-28 11:15:25 I (581509173) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:25 I (581509173) MQTT: Disconnecting MQTT client 2020-08-28 11:15:27 I (581510753) WiFi: Disconnected 2020-08-28 11:15:27 I (581510753) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:27 I (581510753) MQTT: Disconnecting MQTT client 2020-08-28 11:15:28 I (581512083) wifi:new:<11,0>, old:<6,0>, ap:<255,255>, sta:<11,0>, prof:1 2020-08-28 11:15:28 I (581512083) wifi:state: init -> auth (b0) 2020-08-28 11:15:28 I (581512093) wifi:state: auth -> assoc (0) 2020-08-28 11:15:29 I (581513093) wifi:state: assoc -> init (400) 2020-08-28 11:15:29 I (581513093) wifi:new:<11,0>, old:<11,0>, ap:<255,255>, sta:<11,0>, prof:1 2020-08-28 11:15:29 I (581513093) WiFi: Disconnected 2020-08-28 11:15:29 I (581513093) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:29 I (581513103) MQTT: Disconnecting MQTT client 2020-08-28 11:15:30 I (581514673) WiFi: Disconnected 2020-08-28 11:15:31 I (581514673) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:31 I (581514673) MQTT: Disconnecting MQTT client 2020-08-28 11:15:32 I (581516243) WiFi: Disconnected 2020-08-28 11:15:32 I (581516243) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:32 I (581516243) MQTT: Disconnecting MQTT client 2020-08-28 11:15:34 I (581517813) WiFi: Disconnected 2020-08-28 11:15:34 I (581517823) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:34 I (581517823) MQTT: Disconnecting MQTT client 2020-08-28 11:15:35 I (581519393) WiFi: Disconnected 2020-08-28 11:15:35 I (581519393) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:35 I (581519393) MQTT: Disconnecting MQTT client 2020-08-28 11:15:37 I (581520963) WiFi: Disconnected 2020-08-28 11:15:37 I (581520963) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:37 I (581520963) MQTT: Disconnecting MQTT client 2020-08-28 11:15:38 I (581522543) WiFi: Disconnected 2020-08-28 11:15:38 I (581522543) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:38 I (581522543) MQTT: Disconnecting MQTT client 2020-08-28 11:15:40 I (581524113) WiFi: Disconnected 2020-08-28 11:15:40 I (581524113) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:40 I (581524113) MQTT: Disconnecting MQTT client 2020-08-28 11:15:42 I (581525693) WiFi: Disconnected 2020-08-28 11:15:42 I (581525693) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:42 I (581525693) MQTT: Disconnecting MQTT client 2020-08-28 11:15:43 I (581527263) WiFi: Disconnected 2020-08-28 11:15:44 I (581527263) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:44 I (581527263) MQTT: Disconnecting MQTT client 2020-08-28 11:15:45 I (581528843) WiFi: Disconnected 2020-08-28 11:15:45 I (581528843) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:45 I (581528843) MQTT: Disconnecting MQTT client 2020-08-28 11:15:46 I (581530413) WiFi: Disconnected 2020-08-28 11:15:46 I (581530413) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:46 I (581530413) MQTT: Disconnecting MQTT client 2020-08-28 11:15:46 I (581530533) wifi:new:<6,0>, old:<11,0>, ap:<255,255>, sta:<6,0>, prof:1 2020-08-28 11:15:46 I (581530543) wifi:state: init -> auth (b0) 2020-08-28 11:15:47 I (581531543) wifi:state: auth -> init (200) 2020-08-28 11:15:47 I (581531543) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1 2020-08-28 11:15:47 I (581531543) WiFi: Disconnected 2020-08-28 11:15:47 I (581531543) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:47 I (581531553) MQTT: Disconnecting MQTT client 2020-08-28 11:15:49 I (581533123) WiFi: Disconnected 2020-08-28 11:15:49 I (581533123) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:49 I (581533123) MQTT: Disconnecting MQTT client 2020-08-28 11:15:51 I (581534693) WiFi: Disconnected 2020-08-28 11:15:51 I (581534693) BLE2MQTT: Disconnected from the network, stopping MQTT 2020-08-28 11:15:51 I (581534693) MQTT: Disconnecting MQTT client 2020-08-28 11:15:51 I (581534813) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1 2020-08-28 11:15:51 I (581534813) wifi:state: init -> auth (b0) 2020-08-28 11:15:51 I (581534823) wifi:state: auth -> assoc (0) 2020-08-28 11:15:51 I (581534833) wifi:state: assoc -> run (10) 2020-08-28 11:15:51 I (581535073) wifi:connected with eRetailing-IOT, aid = 2, channel 6, BW20, bssid = 7c:5a:1c:45:17:0a 2020-08-28 11:15:51 I (581535073) wifi:security: WPA2-PSK, phy: bgn, rssi: -49 2020-08-28 11:15:51 I (581535073) wifi:pm start, type: 1 2020-08-28 11:15:51 2020-08-28 11:15:51 I (581535083) WiFi: Connected 2020-08-28 11:15:51 I (581535203) wifi:AP's beacon interval = 102400 us, DTIM period = 1 2020-08-28 11:15:52 I (581536023) esp_netif_handlers: sta ip: 10.0.12.6, mask: 255.255.255.0, gw: 10.0.12.1 2020-08-28 11:15:52 I (581536023) BLE2MQTT: Connected to the network, connecting to MQTT 2020-08-28 11:15:52 I (581536023) MQTT: Connecting MQTT client 2020-08-28 11:15:52 I (581536033) system_api: Base MAC address is not set, read default base MAC address from BLK0 of EFUSE 2020-08-28 11:15:52 I (581536063) MQTT: MQTT client connected 2020-08-28 11:15:52 I (581536063) BLE2MQTT: Connected to MQTT, scanning for BLE devices 2020-08-28 11:28:40 E (582304413) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 11:43:40 E (583204423) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 11:58:40 E (584104423) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc 2020-08-28 12:13:40 E (585004433) BT_HCI: btu_hcif_hdl_command_complete opcode 0x2005 status 0xc

shmuelzon commented 4 years ago

So, you believe this all revolves around WiFi and BLE coexistence? Changing the scanning window would affect that but there are still many error messages here. Maybe it would be better using hardware coexisting again (I enabled that a few IDF versions ago but switch to SW after it seemed stable enough). I need to set up a development environment again and play around with these settings...

shmuelzon commented 3 years ago

Hey all,

I've updated the project to support ESP-IDF v4.1 which, I hope, brings more fixes than bugs. In addition I've modified the BLE scanning parameters to values that should allow WiFi and BLE coexistence to work better. I've also added a workaround behavior that would restart WiFi when the connection to the MQTT broker fails 3 times. That should allow the application to resume proper function faster rather than waiting ~30 minutes in some cases.

Please try the new version when you can and let me know if the issues described here still persist.

Thanks!

nubicula commented 3 years ago

I assume my issue is likely the same or similar. At some point, the device drops from the network and does not reconnect. Sometimes after a matter of hours, sometimes after at least a full day. I see you have the workaround for restarting the WiFi, but would it be possible to also add reboot esp32 if still unable to connect after the wifi is restarted, as everything seems to work fine again immediately after a reboot.

shmuelzon commented 3 years ago

The problem is understanding that WiFi is disconnected. The application should receive an event telling it WiFi was disconnected, but it doesn't always. One way I can detect it is when the MQTT client gets disconnected and cannot reconnect 3 times letting me assume there's an issue with the WiFi connection and I can force it to reconnect. Are you telling me that in your case we don't even get any indication from the MQTT client that it disconnects? So both the ESP WiFi stack and MQTT think they're still connected? As far as I know, MQTT has a keep-alive mechanism where it constantly pings the broker so when the network is down it should detect that it was disconnected. Do you have any logs from when your device enters this state?

nubicula commented 3 years ago

At present. I have no logs. Which type would you prefer, just the remote logging, or enable anything from menuconfig. I'll set up either way you need now, and leave until next occurrence. All I am aware of is that at some point the uptime doesn't change and there are no updates to ble devices that normally publish from a notify every minute or two, or to any MQTT Publishes. And of course, thanks again to all assistance and this great project.

shmuelzon commented 3 years ago

I would be happy to get debug logs by opening the configuration menu with idf.py menuconfig. Search, using the / key, for the LOG_DEFAULT_LEVEL choice menu and select DEBUG. Remote logging won't help when the device isn't connected to WiFi so it'll need to be with a serial connection. Thanks, and thank you for putting the time to help make it better!

nubicula commented 3 years ago

Apologies, do you have a reference on connecting the serial logger?

shmuelzon commented 3 years ago

I'm assuming you flashed the application to your device via idf.py -p <serial device> flash. So it should be something like idf.py -p <serial device> monitor.

nubicula commented 3 years ago

Ok, so re-flashed with debug enabled and connected serial monitor and recording to file. Will leave until I notice failure.

nubicula commented 3 years ago

Redoing log file again, as seems not to be working at all at the moment, but may be physical distance at the moment, as for serial monitoring had to bring esp32 closer to pc.

nubicula commented 3 years ago

Hope this helps.. log.ble2mqtt.20210106142907.txt

shmuelzon commented 3 years ago

Thanks for that. So, the first thing I see is that you have many crashes along the way. I think some of them are due to the combination of debug logs and the remote logging mechanism. Maybe it would be better to disable remote logging while debugging for now. Once or twice it also seemed to be because we ran out of memory, but I can't be sure it's due to the same reason.

As for the issue at hand, I don't have any good news about it. Assuming the device simply halts and what you sent is the end of the log, it seems as if the WiFi stack is aware of the fact it disconnected, but it hangs at some point and doesn't send the event to the application. You can search the logs for WIFI_EVENT:5, when it doesn't crash you will see 3 of them at a time as 3 entities have registered on this event. The first is: D (4556) event: running post WIFI_EVENT:5 with handler 0x40149074 and context 0x3ffd08b4 on loop 0x3ffd0184 That calls wifi_default_action_sta_disconnected(). The second is: D (4616) event: running post WIFI_EVENT:5 with handler 0x400db084 and context 0x3ffb7be0 on loop 0x3ffd0184 That calls this application at event_handler() And the third is: D (4636) event: running post WIFI_EVENT:5 with handler 0x4012c398 and context 0x3ffda880 on loop 0x3ffd0184 Which is the mDNS task. Anyways, when it hangs, all I see is:

D (34866) event: running post WIFI_EVENT:5 with handler 0x40149074 and context 0x3ffd08b4 on loop 0x3ffd0184
0x40149074: wifi_default_action_sta_disconnected at /home/artemis/esp/esp-idf/components/esp_wifi/src/wifi_default.c:107
Failed sending remote log: -1
D (34886) esp_netif_handlers: esp_netif action disconnected with netif0x3ffd05ac from event_id=5
Failed sending remote log: -1
D (34896) esp_netif_lwip: check: remote, if=0x3ffd05ac fn=0x4013a0ec
0x4013a0ec: esp_netif_down_api at /home/artemis/esp/esp-idf/components/esp_netif/lwip/esp_netif_lwip.c:1207

It get's stuck somewhere in the first handler and so the second handler (this app) is never notified that WiFi was disconnected. I'll try to search for this issue online but, for now, I don't have much to go on...

nubicula commented 3 years ago

Hello sir, not sure if it will help, but managed to get another extract with remote logging disabled. log.ble2mqtt.20210111083818.txt

shmuelzon commented 3 years ago

Thanks for the log, but I don't see the issue there. All of the WiFi disconnections were handled and the device didn't hang. Were you able to reproduce the issue with the logs you attached?

ratsept commented 2 years ago

Hi,

I seem to be running into some issue very similar to those described here. What seems to have happened is over the weekend the system was left running and it got disconnected from WiFi for whatever reason. From the log it looks like this was reported to the application and it behaved as it should by doing the cleanup. But then it just kept looping that Disconnect - Stop MQTT thing for the whole weekend. Just rebooting the ESP module seems to fix whatever loop it was caught in. All this is in crash_2.log

If I just unplug my WiFi router another issue appears that is also causing the device to fail to appear on the network until rebooted. When I reboot the router the ESP disconnect and reports this to the application like it should. Then everything is cleaned up and I assume WiFi will sit there waiting to reconnect forever. So I plug in the router again and at some point the ESP will try to connect again but will fail authentication. It seems to only try this 3 times and then stop and just sit there forever (I obviously didn't wait forever but it didn't retry connecting withing a few minutes). Rebooting the ESP (with the same authentication keys and code) seems to fix it. This is in crash_3.log

crash_2.log crash_3.log