Open Lukas-H-97 opened 10 months ago
@Lukas-H-97 Can you capture the mesh packets? The duplicated packet received may be a packet retransmitted by the sender.
Sure, do you want data from mesh_addr_t and mesh_data_t? Is there anything more specific you need?
I mean sniffing the wireless packets with Wireshark.
I don't have a network card that supports monitor mode, so I will not be able to sniff the packets. Is there any other way we could approach the issue? I will buy a network adapter, but before that any help would be greatly appreciated.
Here is a capture of the traffic.
@Lukas-H-97 There are only beacon frames in the capture, I can't find the wifi mesh data.
OK, I will check that. Can you run some modules yourself to confirm that you get the same issue? I'm running 9 modules when testing.
OK, I will try to reproduce the issue on my side.
Hello!
@zhangyanjiaoesp Have you been able to test yet?
No, I still can't reproduce this issue.
Can you show the code you use, what modules and a picture of how your modules are placed when you are testing?
The test environment is no longer there because I'm doing something else. You still can't capture the mesh packets?
After the following log the issue appears.
W (45806) wifi:5258
parent candidate 84:f7:03:cc:51:95, candidate_set:1, rssi:-27(threshold:-78), duration:16secs
I (45806) mesh: 989[switch]choose candidate:84:f7:03:cc:51:95<layer:2, rssi:-27, assoc:0>, parent:84:f7:03:cc:78:b1<layer:2, rssi:-32, assoc:2>
I (45816) mesh: 995[SWITCH]connect to candidate:ESPM_CC5194, rssi:-27, 84:f7:03:cc:51:95[layer:2, assoc:0]
I (45826) wifi:state: run -> init (0)
I (45826) wifi:pm stop, total sleep time: 0 us / 32126317 us
I (45836) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<1,1>, prof:1
I (45856) mesh: [wifi]disconnected reason:8(assoc leave), continuous:1/max:12, non-root, vote(,)<>
I (46826) wifi:new:<1,1>, old:<1,0>, ap:<1,1>, sta:<1,1>, prof:1
I (46826) wifi:state: init -> auth (b0)
I (46836) wifi:state: auth -> assoc (0)
I (46846) wifi:state: assoc -> run (10)
I (46846) mesh:
<><><><><><>
I (46846) mesh: from assoc, layer:3, root_addr:24:4c:ab:00:8b:6d, root_cap:10
I (46856) mesh: idle, layer:3, root_addr:24:4c:ab:00:8b:6d, conflict_roots.num:0<>
I (46856) wifi:connected with ESPM_CC5194, aid = 1, channel 1, 40U, bssid = 84:f7:03:cc:51:95
I (46876) wifi:security: WPA2-PSK, phy: bgn, rssi: -29
I (46876) wifi:pm start, type: 0
This log shows the mesh node switching to a better parent. If you can't capture mesh packets, please provide your IDF commit ID and we will try to add some debug logging for this issue.
Any news on this?
I am running into the same issue (using ESP-IDF 5.1): after a while, packets in the mesh are duplicated if they pass through intermediate nodes. I don't have a reduced test case that I can easily share, but when monitoring the traffic from another member of the mesh in promiscuous mode, I see traffic like this for a packet being sent from root to layer 4:
Maybe when reconnecting a node to a parent, it gets added to some forward list again and then receives the same packet multiple times?
Here's some data extracted from our logs:
time | dur | seq | wifi-mac1 | wifi-mac2 | wifi-mac3 | mesh-dest-mac | mesh-src-mac | mesh-data |
---|---|---|---|---|---|---|---|---|
9.782 | 2c00 | 20e1 | 84:cc:a8:00:53:88 | 94:3c:c6:88:52:79 | 94:3c:c6:88:52:79 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | cc320400ffffff0f |
9.799 | 2c00 | 6016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
9.871 | 2c00 | 7016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
9.937 | 2c00 | 8016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
10.117 | 2c00 | 9016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
10.137 | 2c00 | a016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
10.157 | 2c00 | b016 | 34:85:18:98:14:34 | 84:cc:a8:00:53:89 | 84:cc:a8:00:53:89 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 7e2b0f00ffffff0f |
10.206 | 2c00 | 80fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 18f91600ffffff0f |
10.252 | 2c00 | 90fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 18f91600ffffff0f |
10.273 | 2c00 | a0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 18f91600ffffff0f |
10.334 | 2c00 | b0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 18f91600ffffff0f |
10.337 | 2c00 | c0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 18f91600ffffff0f |
10.341 | 2c00 | d0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 19f91600ffffff0f |
10.345 | 2c00 | e0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 19f91600ffffff0f |
10.349 | 2c00 | f0fa | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 19f91600ffffff0f |
10.353 | 2c00 | 00fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 19f91600ffffff0f |
10.356 | 2c00 | 10fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 19f91600ffffff0f |
10.360 | 2c00 | 20fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1af91600ffffff0f |
10.363 | 2c00 | 30fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1af91600ffffff0f |
10.367 | 2c00 | 40fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1af91600ffffff0f |
10.370 | 2c00 | 50fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1af91600ffffff0f |
10.373 | 2c00 | 60fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1af91600ffffff0f |
10.377 | 2c00 | 70fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1bf91600ffffff0f |
10.429 | 2c00 | 80fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1bf91600ffffff0f |
10.432 | 2c00 | 90fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1bf91600ffffff0f |
10.434 | 2c00 | a0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1bf91600ffffff0f |
10.436 | 2c00 | b0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1bf91600ffffff0f |
10.439 | 2c00 | c0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1cf91600ffffff0f |
10.441 | 2c00 | d0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1cf91600ffffff0f |
10.443 | 2c00 | e0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1cf91600ffffff0f |
10.446 | 2c00 | f0fb | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1cf91600ffffff0f |
10.449 | 2c00 | 00fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1cf91600ffffff0f |
10.452 | 2c00 | 10fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1df91600ffffff0f |
10.454 | 2c00 | 20fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1df91600ffffff0f |
10.456 | 2c00 | 30fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1df91600ffffff0f |
10.459 | 2c00 | 40fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1df91600ffffff0f |
10.517 | 2c00 | 50fc | e8:db:84:02:de:00 | 34:85:18:98:14:35 | 34:85:18:98:14:35 | e8:db:84:02:de:00 | 94:3c:c6:88:52:78 | 1df91600ffffff0f |
Macs: | Root | Layer2 | Layer3 | Layer4 |
---|---|---|---|---|
 | 94:3c:c6:88:52:78 | 84:cc:a8:00:53:88 | 34:85:18:98:14:34 | e8:db:84:02:de:00 |
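One possible way to read the seq column of the capture above, as a minimal sketch: it assumes the hex fields are printed in on-air (little-endian) byte order, which the table does not state. In the 802.11 Sequence Control field the low 4 bits are the fragment number and the high 12 bits are the sequence number. The helper names (`le16_from_hex`, `seq_number`, `frag_number`, `payload_run`) are invented for this illustration and are not part of ESP-IDF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode a 4-hex-digit field that is assumed to be printed in wire
 * (little-endian) byte order, e.g. "6016" -> bytes 0x60 0x16 -> 0x1660. */
static uint16_t le16_from_hex(const char *hex)
{
    assert(hex != NULL);
    unsigned long raw = strtoul(hex, NULL, 16);      /* "6016" -> 0x6016 */
    return (uint16_t)(((raw & 0xFFu) << 8) | ((raw >> 8) & 0xFFu));
}

/* 802.11 Sequence Control: high 12 bits are the sequence number. */
static unsigned seq_number(uint16_t seq_ctrl)
{
    return (unsigned)(seq_ctrl >> 4);
}

/* 802.11 Sequence Control: low 4 bits are the fragment number. */
static unsigned frag_number(uint16_t seq_ctrl)
{
    return (unsigned)(seq_ctrl & 0x0Fu);
}

/* Length of the run of identical payload strings starting at index start. */
static size_t payload_run(const char *const payloads[], size_t n, size_t start)
{
    assert(payloads != NULL && start < n);
    size_t len = 1;
    while (start + len < n &&
           strcmp(payloads[start], payloads[start + len]) == 0) {
        len++;
    }
    return len;
}
```

Under that byte-order assumption, the rows with seq `6016` and `7016` decode to sequence numbers 0x166 and 0x167, so the repeated `7e2b0f00ffffff0f` payloads would be separate frames with consecutive sequence numbers rather than MAC-level retries, which reuse the same sequence number.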
Hello @rainers
I really don't know yet why there are duplicated messages, but it could have something to do with mesh stack retransmission. In my case duplicated messages appear when sending from ROOT to NODE. When sending from NODE to ROOT I have not seen duplicated messages.
Here is some code for sending without mesh stack retransmission.
--- ROOT --> NODE ---
esp_err_t mesh_send_device(DEVICE_t *device)
{
    mesh_data_t data;
    mesh_addr_t addr;

    data.data = (uint8_t *)device;
    data.size = sizeof(DEVICE_t);
    data.proto = MESH_PROTO_BIN;
    data.tos = MESH_TOS_DEF; // Disable mesh retransmission
    memcpy(addr.addr, device->child_mac, 6);

    if (esp_mesh_send(&addr, &data, MESH_DATA_P2P, NULL, 0) == ESP_OK)
        return ESP_OK;
    return ESP_FAIL;
}
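esp_mesh_send parameters
1. to [in]: mesh_addr_t with the NODE MAC
2. data [in]: my data
3. flag [in]: from the ESP docs: "If the packet is to an internal device, MESH_DATA_P2P should be set."
4. opt [in]: not used, set to NULL
5. opt_count [in]: not used, set to 0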
--- NODE --> ROOT ---
esp_err_t mesh_send_device(DEVICE_t *packet)
{
    mesh_data_t data_out;

    data_out.data = (uint8_t *)packet; // payload is the caller's DEVICE_t
    data_out.size = sizeof(DEVICE_t);
    data_out.proto = MESH_PROTO_BIN;
    data_out.tos = MESH_TOS_P2P; // When setting this to "MESH_TOS_DEF" it returns "ESP_ERR_MESH_ARGUMENT"

    esp_err_t err = esp_mesh_send(NULL, &data_out, 0, NULL, 0);
    ESP_LOGW(TAG, "%s", esp_err_to_name(err));
    return err;
}
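esp_mesh_send parameters
1. to [in]: from the ESP docs: "If the packet is to the root, set this parameter to NULL."
2. data [in]: my data
3. flag [in]: from the ESP docs: "If the packet is to the root and "to" parameter is NULL, set this parameter to 0."
4. opt [in]: not used, set to NULL
5. opt_count [in]: not used, set to 0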
It seems to work better with mesh stack retransmission turned off, but I have not done much testing. Instead I'm sending from ROOT -> NODE less often, which is OK in my case. I also added a shutdown function so that all NODES and the ROOT reboot at the same time. Rebooting everything seems to flush the retransmission issue away.
Please let me know if you find anything else :)
Thanks @Lukas-H-97 for the hints and examples. Maybe I skipped over it, but I forgot about MESH_TOS_DEF. I'm trying that now, but it will take some time to get results. When using that mode from a node I get the error ESP_ERR_MESH_ARGUMENT from esp_mesh_send(), but no crashes.
I can confirm that I have not seen the issue when sending from node to root. Unfortunately, I cannot easily predict whether packets sent from one node to another will travel downwards at some point in the tree (if that's what is causing the duplication).
We also considered rebooting the mesh when duplicate messages are detected as a workaround, but that doesn't scale very well for larger meshes that probably make the issue appear more often.
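Detecting duplicates at the receiver, as mentioned above, could be sketched roughly as follows. This assumes the application puts its own sequence number into every payload, which is not something ESP-WIFI-MESH provides, and it keeps only a small ring of recently seen (source MAC, sequence) pairs. All names here (`dedup_filter_t`, `dedup_init`, `dedup_seen`, `DEDUP_SLOTS`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEDUP_SLOTS 16  /* hypothetical history depth, tune to the traffic */

typedef struct {
    uint8_t  src[6];    /* sender MAC address */
    uint32_t seq;       /* application-level sequence number (assumed) */
    bool     used;
} dedup_entry_t;

typedef struct {
    dedup_entry_t slots[DEDUP_SLOTS];
    size_t        next; /* slot that will be overwritten next */
} dedup_filter_t;

static void dedup_init(dedup_filter_t *f)
{
    assert(f != NULL);
    memset(f, 0, sizeof *f);
}

/* Returns true if (src, seq) is still in the history, so the packet can
 * be dropped. Otherwise records the pair and returns false. */
static bool dedup_seen(dedup_filter_t *f, const uint8_t src[6], uint32_t seq)
{
    assert(f != NULL && src != NULL);
    for (size_t i = 0; i < DEDUP_SLOTS; i++) {
        const dedup_entry_t *e = &f->slots[i];
        if (e->used && e->seq == seq && memcmp(e->src, src, 6) == 0) {
            return true;
        }
    }
    dedup_entry_t *slot = &f->slots[f->next];
    memcpy(slot->src, src, 6);
    slot->seq = seq;
    slot->used = true;
    f->next = (f->next + 1) % DEDUP_SLOTS;  /* oldest entry is evicted first */
    return false;
}
```

A receive loop would call `dedup_seen()` with the sender address and the sequence number parsed from the payload, and skip processing when it returns true. This only hides duplicates from the application; it does not reduce the extra traffic in the mesh.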
@rainers Yes, you are correct! It only returns ESP_ERR_MESH_ARGUMENT, it doesn't crash the module. My bad.
After about 3 days my test mesh with 13 nodes still runs smoothly without any notable duplicates (though only observed indirectly). Most packets in our mesh are standard network traffic, so if the MESH_TOS_DEF-mode implies occasional missing retransmissions, the TCP layer can deal with that.
Thanks again for the workaround.
@zhangyanjiaoesp Could we get a response on this?
@Lukas-H-97 this workaround seems good. I didn't reproduce the problem on my side. I can provide debug wifi lib based on your IDF version, please tell me the IDF commit you are using.
General issue report
Hi,
I have been having some trouble with duplicated packets when using Wi-Fi mesh.
Code used: Mesh internal communication example.
When does the problem occur: After running the mesh for a couple of hours.
What happens: After a couple of hours there will be a duplicated packet, and after more time there can be more than one duplicated packet. It seems that the function "esp_mesh_recv" causes the issue.
Code to catch the issue: I added code to measure the ticks between messages. Here is the "esp_mesh_p2p_rx_main" code.
void esp_mesh_p2p_rx_main(void *arg)
{
    int recv_count = 0;
    esp_err_t err;
    mesh_addr_t from;
    int send_count = 0;
    mesh_data_t data;
    int flag = 0;
    data.data = rx_buf;
    data.size = RX_SIZE;
    is_running = true;
}
Log:
W (61093395) mesh_main: [#RX:97984/59591][L:3][TICK:102][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61093405) mesh_main: [#RX:97985/59591][L:3][TICK:1][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61094415) mesh_main: [#RX:97986/59592][L:3][TICK:101][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61094425) mesh_main: [#RX:97987/59592][L:3][TICK:1][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61095445) mesh_main: [#RX:97988/59593][L:3][TICK:102][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61095455) mesh_main: [#RX:97989/59593][L:3][TICK:1][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61096475) mesh_main: [#RX:97990/59594][L:3][TICK:102][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
W (61096475) mesh_main: [#RX:97991/59594][L:3][TICK:0][TICKBUGG:1] parent:7c:df:a1:54:3f:35, receive from 68:67:25:2e:51:62, size:1460, heap:61196, flag:0[err:0x0, proto:0, tos:0]
Kind regards, Lukas
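The tick measurement described in this report could look roughly like the sketch below. It is not the reporter's actual code, which was not included in full. It assumes the receive loop passes in the current tick count (for example from the RTOS tick counter) each time a packet arrives, and it flags a packet as a suspected duplicate when it follows the previous one by fewer than a chosen number of ticks. The names `rx_gap_stats_t`, `rx_gap_update` and the threshold `SUSPECT_GAP_TICKS` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: the log shows about 100 ticks between regular
 * packets and 0 to 1 ticks before a suspected duplicate. */
#define SUSPECT_GAP_TICKS 5u

typedef struct {
    uint32_t last_tick;  /* tick count of the previous packet */
    bool     has_last;   /* false until the first packet arrives */
    uint32_t received;   /* total packets seen */
    uint32_t suspects;   /* packets that arrived suspiciously fast */
} rx_gap_stats_t;

/* Record one received packet. Returns true if it arrived less than
 * SUSPECT_GAP_TICKS after the previous one. */
static bool rx_gap_update(rx_gap_stats_t *s, uint32_t now_tick)
{
    assert(s != NULL);
    bool suspect = false;
    if (s->has_last) {
        /* Unsigned subtraction stays correct across tick counter wraparound. */
        uint32_t gap = now_tick - s->last_tick;
        suspect = gap < SUSPECT_GAP_TICKS;
    }
    s->last_tick = now_tick;
    s->has_last = true;
    s->received++;
    if (suspect) {
        s->suspects++;
    }
    return suspect;
}
```

A short gap is only a hint. Two distinct packets can also arrive back to back, so comparing the payloads is still needed to confirm a duplicate.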