Closed danergo closed 18 hours ago
Regarding the issue "ESP32 did not send the Sent Write Requests to Handle: 0x0092," I will add more logging information. Please allow me some time to do so.
Additionally, could you provide a demo that reproduces the problem on your end? It would be even better if the other device is also an ESP.I want to reproduce the issue locally.
Demo is very simple: listen to LE Ads from 8 temperature sensing devices (passive scanning without any filtering). And occasionally connect to this peer.
Slight addition to sniffer vs hci log: sniffer log was started a little bit later, so it misses 2 connection trials to this peer.
Based on the provided HCI log: There were a total of 37 Sent Write Requests to Handle: 0x0092, but only 25 Received Write Responses. Therefore, 37 - 25 = 12 Write Responses were not received.
The problem here is the same IMHO: 12 Write Requests haven't sent out for some unknown reason. This is 12 only, because we stopped the instance (anyway it could be 12000 also). So:
From the sniffer log analysis: There were 23 Sent Write Requests to Handle: 0x0092, and 21 Received Write Responses for Handle: 0x0092. This suggests that the peer device may have failed to respond to 23-21 = 2 Write Requests
We wanted to send 37 Write Requests (based on HCI log), and sniff log contains only 23 Write Request. To compensate the initial delayed start, 37 - 23 - 2 = 12. This is the amount that doesn't show up in the sniffer log, making high probability that it's not even sent out.
Anyway, 1-2 response failure can be accepted: our app will retry connecting once we don't receive response within a certain time-frame. 1-2 can be caused by radio interference too, so I wouldn't take it as serious.
packet loss can occur with your sniffer device
That's unlikely, sniffer is actually very close (3m with clear line-of-sight) to ESP.
I will add more logging information. Please allow me some time to do so.
Sure, thank you very much.
Could you provide the example? If possible, I will use your example to try and reproduce the issue.thanks.
It will be very hard to reproduce it without the exact peer, unfortunately.
I think it will be faster to validate on our side. It might be some location preference too.
So even if you would try to reproduce it on your end, there shall be 'fake' peer(s) which might result different results.
If you agree, I wish to keep the test as it is, because it is 100% reproducible (although a bit slow).
@esp-zhp:
I will add more logging information. Please allow me some time to do so.
Do you have any idea about the timing of this?
Thank you!
Could you run a test where, 1-after establishing a connection, the connection parameters are updated? Once the update is completed, perform a write operation.
2-If the "Sent Write Request, Handle: 0x0092" is not successfully sent, the application layer should retry after 5 seconds instead of immediately disconnecting. If the second attempt also fails, disconnect after another 5 seconds.
Additionally, I’ll provide you with a library that will print some HCI data for debugging purposes. libbtdm_app.zip
I’m still adding more logs, and I’ll send you an updated version of the library later(about one or two day later).
Thank you, we are started the test with this new libbtdm_app. Let's see, will provide you the logs, once it fails.
Filenames to share (myself-2-remember):
minicom_memleak_2412862179.cap
btmon_memleak_2412862179.pcap
sniff_memleak_2412862179.pcap
@danergo ok, I will provide you with the updated library later.
@danergo Do you have the test results on your side?
@esp-zhp: not yet!
It is running smoothly since [15/10/2024 10:13:55.481] (first logline's date from our app).
It logs constantly to debug port (your raw data IMHO). This might cause some timing delay in normal code execution which prevents this bug to appear. I'm not sure, just guessing, although we have never seen this module reach this far.
In case we won't have issues until end of this week, we will add more devices (both advertising, and classic bt) to see the performance.
Or, in case you have any suggestions, we are more than open to hear those.
Thank you
From the sniffer log and HCI log you provided, it seems that there is an issue with the ESP32.
Additionally, I noticed that the "Sent Write Request, Handle: 0x0092" was not omitted but rather delayed, as it was only sent when you disconnected.
Moreover, every time an issue occurs, there is a connection parameter update present. So, I asked you to perform the following test changes:
1-After establishing a connection, update the connection parameters. Once the update is completed, perform a write operation.
2-If the "Sent Write Request, Handle: 0x0092" is not successfully sent, the application layer should retry after 5 seconds instead of immediately disconnecting. If the second attempt also fails, disconnect after another 5 seconds.
You conducted tests for a long time, and the issue did not reappear for the following two reasons:
1-The changes I requested altered the logic of when the problem occurred. 2-The logs I added affected some timing relationships, causing the issue to no longer occur. I tend to favor the first reason because I believe the problem may be related to the connection parameter updates. You could try modifying your code logic without using my library to see if the issue still exists.
Did you change your code logic (as I say) in your most recent test? ps:sniffer log
@esp-zhp:
Did you change your code logic (as I say) in your most recent test?
We didn't change anything, since realizing that long GATT Write Requests are not being sent out.
Now, the code is still running without any single missed Write Request (or Indication).
We have never seen reaching this far with this configuration.
1-The changes I requested altered the logic of when the problem occurred.
So you did change the logic which correlates how these GATT Writes & Connection Updates behave?
Can you please provide me a new library with this change added and verbose logging disabled (or even better: make it configurable with API)?
Thank you!
@esp-zhp: still running without any single failure (all gatt write requests are sent out, and all indications are received).
Can you please provide me a new library with this change added and verbose logging disabled (or even better: make it configurable with API)?
@esp-zhp: still up, with 0 issues! :)
@esp-zhp:
Can you please provide me a new library with this change added and verbose logging disabled (or even better: make it configurable with API)?
I believe this now should be considered as fixed. Although we have to make extra sure that this fix is not caused by the verbose logging which adds load on cpu and (might) slightly modifies the scheduler.
2024-10-25_102830.zip void esp_log_enable_hcievt_log(bool enable); void esp_log_enable_hcicmd_log(bool enable);
Hi, @esp-zhp:
I replaced the lib with your version:
~/esp-idf$ ll components/bt/controller/lib_esp32/esp32/
total 2284
drwxrwxr-x 2 daniel daniel 4096 okt 26 08:45 ./
drwxrwxr-x 3 daniel daniel 4096 okt 2 17:23 ../
-rw------- 1 daniel daniel 776612 okt 25 04:28 libbtdm_app.a
-rw------- 1 daniel daniel 774888 okt 14 15:02 libbtdm_app.a.logging
-rw-rw-r-- 1 daniel daniel 773808 okt 2 17:23 libbtdm_app.a.old
Then I tried to recompile, and it fails at the last step:
controller_hci_uart_esp32/main/controller_hci_uart_demo.c:63:5: error: implicit declaration of function 'esp_log_enable_hcievt_log' [-Wimplicit-function-declaration]
63 | esp_log_enable_hcievt_log(false);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
This is the addition in our example compared to the stock "controller_hci_uart_demo" in this repo:
esp_log_enable_hcievt_log(false);
esp_log_enable_hcicmd_log(false);
while (1) {
extern uint16_t ke_get_heap_free_size(void);
uint16_t free_size;
free_size = ke_get_heap_free_size();
ESP_LOGW(tag, "Free heap size=%d bytes", free_size);
vTaskDelay(10000/portTICK_PERIOD_MS);
}
Compilation fails with the "esp_log_enable..." function.
Can you please double check? Maybe we need to import something else?
I test it,it works. extern void esp_log_enable_hcievt_log(bool enable); extern void esp_log_enable_hcicmd_log(bool enable); esp_log_enable_hcievt_log(0); esp_log_enable_hcicmd_log(0); log:
Thank you, compiled, and works.
1 day passed without any issue after retesting began.
@esp-zhp, it failed again, (with logs disabled). So I believe this is indeed some timing issue.
Will share you the logs soon.
@esp-zhp: logs shared.
@danergo I think this might be related to the connection parameter update. Could you run a test where, 1-after establishing a connection, update the connection parameters. wait the update is completed,and then perform a write operation.
2-If the "Sent Write Request, Handle: 0x0092" is not successfully sent, the application layer should retry after 5 seconds instead of immediately disconnecting. If the second attempt also fails, disconnect after another 5 seconds.
3-don‘t use my lib, origin lib is ok
thanks.
@esp-zhp:
My code has no influence or knowledge of L2CAP messages.
It just:
Moreover, in case I try manually (with gatttool): this means im typing manually, no script involved here: 1.) Connect, then wait until connected 2.) Mtu 247 (gatttool denies client's mtu increase request, but this command sets it correctly), wait for confirm 3.) Char-write-req 0x0093 0200, wait for confirm 4.) Char-write-req long
This WORKS for 10-12hrs, but then suddenly long writes are not sent out anymore, UNTIL esp gets reset. And as you can see gatttool has no info on l2cap.
Now I'm more than sure gatttool is not changing anything under the hood after 10-12 hrs, moreover I don't have to restart the Linux, only the esp to get back to normal.
Moreover in case we enable the verbose log in your library, it hides this problem.
Again, this is something we can't control, and now it's more than likely that esp behaves different after running 10-12 hrs.
We can solve this by either running it wit verbose log, or restart it every hour. Neither is too effective.
What do you think?
you may not have understood what I meant. 1-I've noticed the issue: your write request wasn't sent as expected. 2-I suspect this is related to the connection parameter update. so I'd like you to wait until the connection parameter update completes before performing the write operation, rather than writing immediately.-If the "Sent Write Request, Handle: 0x0092" is not successfully sent, the application layer should retry after 5 seconds instead of immediately disconnecting. If the second attempt also fails, disconnect after another 5 seconds.
-> I believe this approach might solve your issue.
Dear @esp-zhp:
I understand your points, let me rephrase my concerns here, reacting on yours:
1-I've noticed the issue: your write request wasn't sent as expected.
Please see these two below screenshots, taking into consideration my labels and explanations:
(For others: each screenshot consist of two logs: upper is recorded with a sniffer, and bottom is logged directly on the RPi which ESP32 is attached to)
Good commincation trial steps: | # | What? | Source | Destination | SniffID | DirectID |
---|---|---|---|---|---|---|
1 | Exchange MTU Request | from Slave | to Master | 2442 | 5228 | |
2 | Exchange MTU Response | from Master | to Slave | 2446 | 5230 | |
3 | Write Request (0x0093) | from Master | to Slave | 2448 | 5231 | |
4 | Connection Parameter update request | from Slave | to Master | 2449 | 5234 | |
5 | Write Response (for 0x0093) | from Slave | to Master | 2452 | 5241(!!!) Look at this! | |
6 | Connection Parameter update response | from Master! | to Slave | 2453 | 5237(!!!) | |
7 | Write Request (0x0092) | from Master | to Slave | 2457 | 5242 |
Key takeaways for good trial:
Okay, let's move on to the stage when ESP starts behaving wrong:
Failed commincation trial steps: | # | What? | Source | Destination | SniffID | DirectID |
---|---|---|---|---|---|---|
1 | Exchange MTU Request | from Slave | to Master | 106498 | 202392 | |
2 | Exchange MTU Response | from Master | to Slave | 106503 | 202394 | |
3 | Write Request (0x0093) | from Master | to Slave | 106505 | 202395 | |
4 | Connection Parameter update request | from Slave | to Master | 106506 | 202399 | |
5 | Write Response (for 0x0093) | from Slave | to Master | 106508 | 202406(!!!) Look at this! | |
6 | Connection Parameter update response | from Master! | to Slave | 106509 | 202402(!!!) | |
7 | Write Request (0x0092) | from Master | to Slave | !!MISSING!! | 202407 |
Key takeaways for failed trial:
Additional notes, regarding your recommendation of
If the "Sent Write Request, Handle: 0x0092" is not successfully sent, the application layer should retry after 5 seconds instead of immediately disconnecting.
For 2.) please see this new log (btmon_finalproof.cap, shared with your email too):
I followed these steps now:
Key takeaway: This is the root cause of the problem: after 10-12 hours, ESP's controller does NOT respect MTU changes anymore, leaving the RX buffer on default 63 bytes therefore ESP is NOT able to process any longer messages until it's reset.
I hope you accept this as proof. To me this looks straightforward, but please let me know in case you need any further information/clarification. I really appreciate your help and efforts!
Thank you very much
leaving the RX buffer on default 63 bytes
I meant HCI rx buffer here.
@esp-zhp: did you have time for checking my last details here?
@danergo sorry for later reply,I'm still analyzing the cause. After adding the HCI log, the issue stopped occurring, which complicates debugging. I'll add more debug logs to investigate further.
sniffer: It is evident that the write request was not sent on time. However, it was still sent.
If there is no response within 3 seconds, could you resend the write request?
Unfortunately I can't resend, as the remote Client is disconnecting after about 1sec in case it doesn't hear from us. You only see this terminate command, because my app doesn't receive any notification about client's disconnect, so we implemented a (max)5 second timeout: after 5seconds, it is 100% impossible that we are still connected so it's safe to drop and close the connection.
Client is on battery so it has limited time for communication.
What is strange for me is that 63byte threshold: it really seems some default threshold. I can send anything below 64 length, and nothing from 64bytes length. That should be a great way to start finding some spots :)
Hi, @esp-zhp: do you have any news?
@esp-zhp: i have a new idea:
Creating a firmware in which we can turn on/off logs dynamically via debug uart (i.e.: not hardcode log setting into firmware). In this case we can start with hci logging turned off, then once issue occurs, we can turn on hci logging over debug port, without rebooting esp.
What do you think? Can this help? How shall we read from debug uart without disturbing the btdm controller?
@danergo Sorry for the late reply again.
Based on your description, there is a version with HCI logging enabled. If HCI logging is always on, it works fine, but power consumption will be higher. For now, you can use the library with HCI logging enabled to continue your development.
You mentioned: “Creating a firmware that allows us to dynamically turn logging on and off via the debug UART (i.e., without hardcoding log settings into the firmware). In this case, we can start with HCI logging turned off, and then, if an issue occurs, enable HCI logging through the debug port without rebooting the ESP.”
I’m not sure if this approach will work, as I haven’t yet identified the root cause of the issue. I will update the library and plan to add debug information for suspected areas. When an issue occurs, you can dump this debug information (I’ll provide a new API for this), and I’ll use the data to help resolve the problem.
I’m still working on the debug library, and this may take some time. In the meantime, please proceed with your testing and development.
Okay, thank you!
Hi, @esp-zhp: did you have some time to progress?
We are running our main service with HCI logs enabled, and it's smooth. But I think this shall not be considered as a final solution.
Thank you!
yes,I will give you lib this week
Dear @esp-zhp:
We are sending you a new "proof" capture: this time with:
This contain a LOT of garbage, please forgive us. Please focus on the ATT frames. You will see normal procedures of GATT writes and confirmations. Focus on 0x0092 Write Response!
You'll see: 0x0092 responses are coming until ID#634057 (last Write Response in log). Afterwards, you'll see no more Write Responses (for 0x0092).
As this time we have the debug log enabled, we can state, that simply enabling "HCI verbose log" doesn't solve this problem. (Luckily! At least, logging is not hiding this bug :) )
Please check our logs, maybe you'll see something interesting which can help you with this topic.
Here’s the English translation and refinement:
I analyzed the HCI log file “minicom_newproof.cap” that you provided. Using internal tools, I converted the HCI log into a BTsnoop file format, which is compatible with software like Wireshark and "Wireless Protocol Suite."
At timestamp 654,633, there is an ATT_WRITE_REQ
with handle 147:
ATT_WRITE_REQ 147 Not Mapped 0(C) Unknown [3] 14 00:00:00.000063 2024/11/14 11:29:20.135370 147
I noticed that the controller did not subsequently receive any HCI data related to Attribute Handle: 146
. In other words, the controller did not receive the write request (handle 0x92) issued by the Bluz host. This behavior appears abnormal.
I don’t fully understand this behavior. Why didn’t the host continue to send the write request (handle 0x92)? Or did it send the request, but the controller simply didn’t receive it?
ps:HCI log from ESP32(decoded) I will send you by email to protect your privacy.
From the HCI log on your host side, it appears that Bluz sent a write request (handle 0x92), but the ESP32 controller either did not send the packet or failed to send it promptly. This seems to be an issue on the ESP32 controller's side.
However, from the ESP32 controller's perspective, it did not receive the write request (handle 0x92).
I’m unsure how the HCI log on Bluz was captured or whether it is entirely accurate. If this issue occurs, could restarting Bluz while keeping the ESP32 controller running resolve the problem?
@esp-zhp:
I noticed that the controller did not subsequently receive any HCI data related to Attribute Handle: 146. In other words, the controller did not receive the write request (handle 0x92) issued by the Bluz host. This behavior appears abnormal. I don’t fully understand this behavior. Why didn’t the host continue to send the write request (handle 0x92)? Or did it send the request, but the controller simply didn’t receive it?
That's exactly the root cause of the problem! Please note: all ATT 146 messages are long (longer than 63 bytes).
I have tested this many times: after ESP (or BlueZ) gets into this problematic stage, any write request (ATT handle doesn't matter) with data length above 63 won't send out (anything below or equal 63 will be sent out perfectly)!
Let's try to identify the wrong part here: 1.) BlueZ doesn't forwards the 146: can be, but highly unlikely, as btmon (part of BlueZ) still logs the long gatt writes even during the problematic stage. 2.) ESP doesn't receive the 146: can be, as there is no 146 in the decoded sniff.
For the 100% proof, let me wait for a couple of days to let the problematic stage arrive, and then I'll hook up the oscillator to the ESP's RX pin, to make you a final proof. :)
Thank you!
Hi, all (@esp-zhp).
We have concluded a final test with an O-scope tapped onto the TX pin of RPi (RX pin of ESP32).
Our assumption has been 100% validated now: RPi sends the long gatt write, but ESP32 doesn't acknowledge it.
Please see the shared log: it's a small one!
Facts:
For the last step (3), we have recorded the screen of the scope: this is the proof, that ESP's RX pin is receiving this long write (takes almost 700us), but I assume you will not see this in the debug log from ESP (shared with you too):
This is proved now an ESP-side bug, and it is present in the closed-source library, so we can't do anything else on our side to fix it.
@danergo Thank you for your feedback.
You've demonstrated with an oscilloscope that the host is sending data to the ESP32 controller, which is great. However, there's still a question: the oscilloscope lacks the functionality of a logic analyzer. This means we cannot be 100% certain that the transmitted data is correct unless we use a logic analyzer to parse the UART data from the ESP32's RX pin.
That said, we currently have strong reason to believe that the ESP32 UART is not correctly receiving the data. While this cannot be confirmed with absolute certainty at the moment, it is highly probable that the ESP32 is failing to receive the data properly.
Additionally, I reanalyzed the minicom_newproof.cap
file you provided, which shows all the data received by the ESP32 controller.
I found an anomaly (which is crucial):
Normally, ACL data begins with 0x02
, as per the specification. However, during the issue, the data received by the ESP32 was missing the first 5 bytes. Within the received data, "12 92 00"
indicates a write request to handle 0x92
, which is the data we need!
Continuing with debugging:
(Assuming the host's transmitted data is 100% correct)
The reason the write request wasn't sent out is that the UART failed to receive the data correctly and dropped part of it. Here are some potential points to investigate:
3.1 Has the ESP32 entered sleep mode?
By default, it doesn't unless you've modified the sdkconfig
. If you have, could you share your sdkconfig
file with us?
3.2 Is UART flow control enabled on the ESP32?
Please confirm whether CTS (GPIO 23) and RTS (GPIO 19) are being used. If not, UART issues may arise.
3.3 Add some UART debugging:
When the issue occurs (i.e., when the write request fails to send), print the UART registers. Here's a sample code snippet:
#include "soc/uart_reg.h"
#include "driver/uart.h"
uint32_t int_raw_status = REG_READ(UART_INT_RAW_REG(CONFIG_BTDM_CTRL_HCI_UART_NO));
A potential solution:
The ESP32 controller supports both UART and VHCI methods.
(Top) → Component config → Bluetooth → Controller Options → HCI mode
Currently, the esp-idf/examples/bluetooth/hci/controller_hci_uart_esp32
example uses UART directly instead of VHCI, and it doesn't leverage the IDF's UART driver.
For other BLE and BT examples, we typically use VHCI directly.
Here are the relevant functions:
/* VHCI */
extern bool API_vhci_host_check_send_available(void);
extern void API_vhci_host_send_packet(uint8_t *data, uint16_t len);
extern int API_vhci_host_register_callback(const vhci_host_callback_t *callback);
Currently, the issue lies in the ESP32 controller not receiving data (by uart) from the host correctly.
Therefore, you can transmit HCI data via the VHCI interface instead of using the controller's UART.
Specifically:
VHCI has already undergone extensive testing, so I believe it can address your issue.
This task can be undertaken by either you or us, though it will take some time.
Additionally, I noticed that someone has already started working on this, but I haven’t tested their implementation yet.
Here is the code link:
[ESP-Hosted Network Adapter Bluetooth Slave](https://github.com/espressif/esp-hosted/blob/master/esp_hosted_ng/esp/esp_driver/network_adapter/main/slave_bt.c)
Thank you for extensive description.
I am more than sure that my long write is correct although I have shared a scope screenshot and not a signal analyser (measured shorter gatt writes, and scope showed shorter data flow).
However, we do have signal analyser as well, will provide you that evidence too.
Will also share our sdkconfig, sleep mode (as far as I remember) is enabled.
Will attach it here.
Thank you!
Great! Make sure to print the relevant registers when an issue occurs. Additionally:
Disable sleep mode. Enable UART flow control.
how do you think A potential solution?
Uart flow control is enabled.
Anyway, for VHCI we have just a slight knowledge. How shall we attach to the BlueZ? With btattach? Will vhci present a device in /dev?
Thank you. Will get back to you with details later today.
OMG, my sdkconfig is huge, compared to example's. Please find it attached here.
Many sleep configs are enabled (which might have been modified by us). Will do now 2 things:
1.) Signal-trace the RX pin for the HCI data 2.) Recompile the controller's code with default sdkconfig (or a slightly modified, without sleep).
We have high hopes about this sleep configuration that it can solve this problem :)
Thank you!
Hi, @esp-zhp:
I have some news!
We hooked up our signal tracer onto the RX pin, and see what happened. We measured two cases:
1.) gatt write request with data of 0000000011111111
:
This is great, works OK.
2.) gatt write request with data of 0000000011111111...ffffffff
:
As you can see, RPi is sending only 64bytes of data!
By this time, we thought this must be on RPi (BlueZ or kernel) side.
But we went a little further and reverted back our sdkconfig to the defaults (no sleep enabled now). So, we also had to reflash the ESP, and of course we had to restart BlueZ too. Signal tracer was still hooked onto the RX pin, and by some random idea we did another measurement (now gatt long write is working, as usual after reboot).
But then we have some amazing result:
As you can see, there are two "batches"! First batch is exactly 64bytes long. Delay between two batches is approximately 100us.
Then we pushed even further with a very long gatt write:
No need to say anything. Our SPI to UART component have a 64byte FIFO buffer.
Is is possible that ESP is holding down (or up) the CTS (or RTS?) pin, in order to prevent further data transmission because it must process the current batch?
In full UART communication, which pin is used (driven) by ESP in order to signal the RPi to prevent data transmission? RTS or CTS?
Thank you very much!
Based on your feedback, the current issue seems to be that the Raspberry Pi is not transmitting data correctly, rather than a problem with the ESP32 receiving end. Is that correct?
When hardware and drivers support RTS and CTS, their behavior is automatically managed, and user intervention is typically not required.
If you wish to manually intervene, it should be feasible. Feel free to test it.
However, may I ask why you want to take this approach?
RTS and CTS work together to achieve hardware flow control:
From the logs:
I (2168) CONTROLLER_UART_HCI: HCI UART1 Pin select: TX 5, RX 18, CTS 23, RTS 19 Baudrate: 921600
Pin CTS 23 is used to signal the Raspberry Pi to prevent data transmission.
Thank you.
Based on your feedback, the current issue seems to be that the Raspberry Pi is not transmitting data correctly, rather than a problem with the ESP32 receiving end. Is that correct?
Yes, it seems being on our side, so we wish to say a huge sorry and thank for your kind help through all of this.
We don't want to control the CTS/RTS behavior. We just wanted to check weather it's ESP not accepting more, or RPi can't send more.
uart_set_pin(CONFIG_BT_HCI_UART_NO, 5, 18, 19, 23);
This case we need to trace PIN 23 on ESP to see if it's need to stop us sending, right?
One final test we must do to confirm this is really on the RPi side: once it fails again, we need to signal trace the CTS too: it still can be that in failed case ESP is not pulling the CTS line low fast enough for RPi to continue sending and it halts the transmission (although it's fairly unlikely - BUT! In case we don't reset ESP after failed case, it's not getting back to normal, so still pretty mysterious).
Thank you!
Answers checklist.
IDF version.
Latest
Espressif SoC revision.
NodeMCU-ESP-32S
Operating System used.
Linux
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
None
Development Kit.
NodeMCU-ESP-32S
Power Supply used.
External 5V
What is the expected behavior?
Stable operation
What is the actual behavior?
Manual power recycle is needed in every 12hrs.
hciconfig hci0 reset is timing out. btattach also times out.
Watchdog enabled but it is not triggering a reset. Coredump enabled but no coredump is being written. Verbose logging also enabled but only few log items are shown.
Steps to reproduce.
Ble ad scanning with hardware filtering (based on device mac and ad) at least 8 devices.
In every 5mins, try connecting to a standard (not ble) devices (which is out of range) - so connection will have to fail always.
Occasionally connect to a ble device (which is in range and shall be succeed).
Every 12 hours (roughly) we have to manually reset the esp. Otherwise hci0 will eventually go down.
Before hci0 going down, we can still try connecting to a ble device but we can't receive longer data from it.
(Ble device asks us an MTU increase, and we accept it, but then we can't receive data: but this happens ONLY after 10-12 hours of constant stressing esp with the above advertising scaninngs and 5mins inactive device connect trials).
I guess some buffer is overfilling but I couldnt enable any practical logging in menuconfig.
What do you suggest?
Debug Logs.
No response
More Information.
No response