Open phaltness opened 4 years ago
Thanks for the detailed report, we will look into.
Hi @phaltness
Sorry for the late response.
Based on the above description, so the fast provisioning client is trying to send vendor messages to the node which is not provisioned by itself?
If my understanding is correct, currently this will cause the error like above. Please apply the patch and have a try. It's just a temporary solution, I will find a better way for the approach.
0001-ble_mesh-remove-dst-check-temporarily.patch.zip
Thanks.
Hi @Campou ! thanks for hot fix solution - some how its working now), but there is no way to ask for provisioned addresses thru the all mesh nodes, i mean primary provisioner knew about nodes in the first wave of provisioning, other nodes, provisioned from fast_prov_nodes - not reporting about his assigned addresses so, my next question is - how to get all node addresses from mesh to primary provisioner? Hope for the answer soon ).
Have a good day.
Hi @phaltness
So you mean get all the node addresses (primary address & element count) from the primary Provisioner?
During fast provisioning, primary Provisioner is responsible for receiving all the node addresses from other fast prov nodes. When a fast prov node provisions a device successfully, it will send the node address to the primary Provisioner.
And currently in the fast provisioning examples, the message ESP_BLE_MESH_VND_MODEL_OP_FAST_PROV_NODE_ADDR_GET
is used by the top Provisioner to get all the primary element addresses of all the nodes from primary Provisioner.
If trying to get the element count, the message format needs to be modified to contain the element count combined with each primary element address.
Thanks.
@phaltness Thanks for reporting, would you please help share if any updates for this issue? Thanks.
Hello, I'm currently facing the same issue. I used the patch provided and everything is working as expected!
I guess it needs a MR.
Thanks!
Hi @MrCurio
Thanks for the confirmation. I will submit a MR for this.
Thanks.
Hi @Campou , I am having another possible related issue that may be related with this one before you submit a MR.
I am unable to receive STATUS messages from nodes not provisioned from the fast provisioning client. I am testing it with an onoff client sending a set message to an onoff server temporary node (not provisioned with the fast prov client).
Messages simply dont arrive on the client's side, but they are certainly being sent correctly after the patch provided here.
Could you please take a look at this?
Thank you!
Hi, I have been looking into the Stack and i think i have found the problem and a temporary solution. I would like someone to review in case it may cause some security problems
It seems that this function forces the Fast Provisioning Client to ignore direct messages(mesh_core/net.c):
static bool ignore_net_msg(u16_t src, u16_t dst)
{
if (IS_ENABLED(CONFIG_BLE_MESH_FAST_PROV)) {
/* When fast provisioning is enabled, the node addr
* message will be sent to the Primary Provisioner,
* which shall not be ignored here.
*/
return false;
}
if (IS_ENABLED(CONFIG_BLE_MESH_PROVISIONER) &&
bt_mesh_is_provisioner_en() &&
BLE_MESH_ADDR_IS_UNICAST(dst) &&
bt_mesh_elem_find(dst)) {
/* If the destination address of the message is the element
* address of Provisioner, but Provisioner fails to find the
* node in its provisioning database, then this message will
* be ignored.
*/
if (!bt_mesh_provisioner_get_node_with_addr(src)) { <--- Here is the problem i am talking about
BT_INFO("Not found node address 0x%04x", src);
return true;
}
}
return false;
}
This function makes complete sense when using a normal provisioning operation since it block messages from other nodes not provisioned by the main provisioner. However, when using Fast Provisioning it may lead to some problems because it blocks communication between the FP Client (Fast Provisioning Client) and other not directly provisioned nodes within the same network.
To overcome this problem, i just simply changed the BT_INFO macro to BT_ERR macro (to do not forget about this issue) and the return value i changed it to "false" instead of "true".
I would appreciate someone to review this thing.
Thank you!
Hi @MrCurio
Thanks for this. Yes, for FP Client, the above approach blocks it from receiving messages from nodes not provisioned by itself. We need to find a better approach for this.
Thanks.
And we add this approach here because:
Thanks.
I update this issue adding some more information probably related to the same issue.
When using the config client to send a ESP_BLE_MESH_MODEL_OP_NODE_RESET
to a non-directly provisioned node (Fast Provisioned) and applying the workaround posted by me above, this message cannot be sent due to a lack of the device key.
I think this is related again to the problem of the main provisioner not having some essential information regarding non-directly provisioned nodes.
I leave some of the errors i am getting:
I (140401) BLE_MESH: Client message 0x00008049 with timeout 4000ms
I (140411) BLE_MESH: send, app_idx 0xfffe src 0x0001 dst 0x0006
I (140421) BLE_MESH: send, len 2: 8049
E (140421) BLE_MESH: DevKey not found
E (140431) BLE_MESH: Failed to send client message 0x00008049
Also related to this problem, but i am not sure how yet. After deleting a directly provisioned node, all messages from its children cannot be received (also applying the workaround).... I am looking into it.
Thanks!
By the way, if this issue gets fixed let me know as soon as possible because it may cause me a lot of further problems. Is there in mind a future release of ble mesh with a bunch of bug fixes?
Thanks
Hi @MrCurio
Sorry for the late reply.
For fast provisioning, temporarily removing the checks for sending and receiving messages for FP client could only resolve part of the problem, i.e. only makes the FP client sending and receiving messages from the nodes not provisioned by itself works, but the messages could only use AppKey + NetKey.
If any messages needs the DevKey + NetKey (currently only configuration messages), this would cause the failure like above, this is because the nodes are not provisioned by the FP client, thus no device information (especially DevKey) could be found for sending messages.
In order to completely resolve this problem, I may introduce the following changes:
Please help to take a look at the above changes, these should be able to completely resolve this problem. And firstly I will provide a patch for this ASAP.
Thanks.
Hi @Campou
Thanks for the reply. Bluetooth Mesh issues are critical in my application, so i tried to be as updated as possible.
The changes you recommend sound good to me and they will probably fix this issue. Looking forward to applying your patch :+1:
In fact, if you don't mind, could you please solve me this doubt?
esp_ble_mesh_provisioner_prov_disable
) affects using its own Net and App Keys?
Quite often and randomly (more often when timeout messages), i get the following error: Invalid NetKeyIndex 0x0000
. While debugging i found out the problem occurs in mesh_core/access.c line 1134
:struct bt_mesh_subnet *bt_mesh_tx_netkey_get(u8_t role, u16_t net_idx)
{
struct bt_mesh_subnet *sub = NULL;
if (IS_ENABLED(CONFIG_BLE_MESH_NODE) && bt_mesh_is_provisioned() && role == NODE) {
sub = bt_mesh_subnet_get(net_idx);
} else if (IS_ENABLED(CONFIG_BLE_MESH_PROVISIONER) && bt_mesh_is_provisioner_en() && role == PROVISIONER) {
sub = bt_mesh_provisioner_subnet_get(net_idx);
} else if (IS_ENABLED(CONFIG_BLE_MESH_FAST_PROV) && bt_mesh_is_provisioned() && role == FAST_PROV) {
sub = bt_mesh_fast_prov_subnet_get(net_idx);
}
return sub;
}
From what i see in this function, seems that for the provisioner to get the NetKey, the provisioning should be enabled (bt_mesh_is_provisioner_en
).
My application fast provisions all the nodes and once the net is formed, it disables the provisioning in every node including itself. However, after disabling main provisioner's bearers, the "Invalid Netkey Index" error occurs more often. By simply not disabling the provisioning process, the error i mentioned does not happen anymore
Does this make any sense?
As i said this issues are critical for my application and i am really interested in having a direct contact for ble-mesh updates and issues besides posting an issue on here and awaiting for response. My email is: mrcuriogit@gmail.com, please send me an email . I would be so grateful if you agree to help me with further issues in my application.
Btw, If you prefer i could post this message in other issue Thanks!
@MrCurio Thanks for sharing the updates, please help write to sales@espressif.com, thanks.
Hi @MrCurio
Please have a try with this patch 0001-ble_mesh-stack-Add-API-to-store-fast-provisioning-no.zip.
Note: the patch is generated based on the idf v4.2 release.
In this patch, I have added the functionalities as described above.
esp_ble_mesh_provisioner_store_fast_prov_node_info
with the parameter unicast_addr
set to the unicast address of the node, and the parameter element_num
sets to 1 (at this moment, the exact element_num
is not known by Top Provisioner);example_send_fast_prov_node_info_get
to get the node's information, i.e. element count, device uuid, and device key;esp_ble_mesh_provisioner_store_fast_prov_node_info
again to update the node's information.And with all the above procedures performed, now Top Provisioner could send Configuration messages to the node.
Top Provisioner is a Provisioner actually, and the Provisioner functionality shall not be disabled. For other nodes, the Provisioner functionality is a temporary one, when the fast provisioning is disabled, the Provisioner functionality will also be disabled in the mesh stack.
Thanks.
Thank you very much for the patch. I will try it next week and update this issue with the results :+1: !
Hi @Campou I have tried this week the patch you have provided and i found some issues solving the initial problem. I don't know yet if its on my side or on yours, but i explain below.
I tried the patch in small networks 2-4 nodes and everything seemed to work as expected (storing node info with empty uuid and once the node info is received, update the uuid and device Key). However, today i tried to implement larger networks around 6-7 nodes and i have some issues related to the patch:
I (53102) FAST_PROV_CLIENT: The stored address is :0006
I (53222) FAST_PROV_CLIENT: The stored address is :0008
I (53242) FAST_PROV_CLIENT: The stored address is :0007
W (53382) BLE_MESH: Node already exists, uuid 00000000000000000000000000000000
E (53382) BLE_MESH: Failed to find DST 0x0008
E (53382) BLE_MESH: Not ready to send
E (53392) BLE_MESH: Failed to send client message 0x00ce02e5
W (53392) BLE_MESH: Node already exists, uuid 00000000000000000000000000000000
E (53402) BLE_MESH: Failed to find DST 0x0007
E (53402) BLE_MESH: Not ready to send
E (53412) BLE_MESH: Failed to send client message 0x00ce02e5
W (53422) Model: ESP_BLE_MESH_PROVISIONER_STORE_FAST_PROV_NODE_INFO_COMP_EVT, err_code 0, node 0x0006
W (53432) Model: ESP_BLE_MESH_PROVISIONER_STORE_FAST_PROV_NODE_INFO_COMP_EVT, err_code -17, node 0x0008
E (53442) Model: Failed to send message 0xce02e5
W (53442) Model: ESP_BLE_MESH_PROVISIONER_STORE_FAST_PROV_NODE_INFO_COMP_EVT, err_code -17, node 0x0007
E (53452) Model: Failed to send message 0xce02e5
W (54132) Model: RECEIVED INFO TTL: 7 and Addr: 0006
W (54142) Model: ESP_BLE_MESH_VND_MODEL_OP_FAST_PROV_NODE_INFO_STATUS
I (54142) FAST_PROV_CLIENT: Get FP node 0x0006 information, elem_num 1
I (54142) Device UUID: 32 10 10 52 1c 82 3d 4e 00 00 00 00 00 00 00 00
I (54152) Device Key: bf 57 e7 f2 88 5e 9f bf 2a af 70 0d 3e a4 ef 26
W (54202) Model: ESP_BLE_MESH_PROVISIONER_STORE_FAST_PROV_NODE_INFO_COMP_EVT, err_code 0, node 0x0006
In the logs above, it can be observed that i parse the addresses received and i save them with empty uuid and device Key. Once the function esp_ble_mesh_provisioner_store_fast_prov_node_info
has returned, i proceed to ask for the real uuid and device Key, but it crashes before.
This may happen because i try to ask for the node info before the ESP_BLE_MESH_PROVISIONER_STORE_FAST_PROV_NODE_INFO_COMP_EVT
event . However, after several tests, i can conclude this is not the problem.
Right now i am a bit lost, it seems that the node information is not stored succesfully because it has been already stored. However, i just call the esp_ble_mesh_provisioner_store_fast_prov_node_info
funtion once.
I leave below the part of my code where i perform these operations:
for (int i = 0; i < length; i = i + 2)
{
uint16_t node_addr = 0;
memcpy(&node_addr, data, 2);
data = data + 2;
if (ESP_BLE_MESH_ADDR_IS_UNICAST(node_addr)){
ESP_LOGI(TAG, "The stored address is :%04x", node_addr);
err = store_node_info(NULL, node_addr, 1, 0xFFFF, 0xFFFF);
if (err != ESP_OK)
{
ESP_LOGE(TAG, "Error: Cannot store address :%04x %s:%d", node_addr, __func__, __LINE__);
return ESP_FAIL;
}
err = esp_ble_mesh_provisioner_store_fast_prov_node_info(node_addr, 1, NULL, NULL);
if (err != ESP_OK)
{
ESP_LOGE(TAG, "Error: Cannot store address :%04x %s:%d", node_addr, __func__, __LINE__);
return ESP_FAIL;
}
example_node_info_t *node = get_node_info(node_addr);
if (!node)
{
ESP_LOGE(TAG, "Error: Cannot find address :%04x %s:%d", node_addr, __func__, __LINE__);
return ESP_FAIL;
}
get_node_net_info(node_addr); -------------> Here i send the message to get Dev Key and UUID
}
}
Thank you!
Hi @MrCurio
Thanks for the update.
Updated the patch ble_mesh-stack-Add-API-to-store-fast-provisioning-no-update.zip, in the patch, if the device uuid is zero when adding the node information, then ignore the device uuid check.
Thanks.
Hi @Campou Thank you for the updated patch! I will continue testing the patch over the week and if no further issues appear, i will proceed to close this issue.
Thanks! :+1:
Hello @Campou I have fully tested the patch and no other issues appeared with regards to software. However, i have noticed that this last message to obtain node's UUID and Device Key becomes a little problematic when developing extended networks.
I am working with a 5-8 hop network where nodes are distanced 10 meters max with each other. When this message is sent from the furthest node to the Provisioner (let's say 7-8 hops), the volume of the payload forces the process of segmentation over 7-8 hops and reassembly at the destination, which causes constant timeouts even if set timeout to 5-6 seconds, which is unacceptable in my application.
I have run some tests aside fast provisioning and i got some shocking results comparing latency vs hops vs payload. Apparently, when 3-4 segmentation processes occur over more than 4-5 hops, latency dramatically increases to almost 3-4 or more seconds. Isn't it a little too much? Any recommendation?
I have seen the results posted by Silicon Labs about Bluetooth Mesh perfomance and they have much lower latencies. Is there any reason for this besides the use of an external antenna? Would it really improve my network perfomance if i use an external antenna?
Thanks
MrCurio
Hi, is there a reason the patch has not been merged after so long?
Hi @Campou @MrCurio, i realised the patch is incomplete as it does not store configuration in persistent storage after reset for the nodes. Only the main provisioner can retain its data after reset.
I guess this is the problem, found this on the documentation page
Environment
Problem Description
running modifying fast_provisioner examples with custom models messaging over all nodes (relay support is enabled by default): 1 - message sending to nodes of 1-st wave of provisioning is OK 2 - unicast messages to tne nodes of 2-nd wave of provisioning - is not accessible - error in log : BLE_MESH: ready_to_send, Failed to find DST 0x0015 3 - at the same time broadcast messaging (DST 0xFFFF) is working OK
Expected Behavior
CLIENT: send vendor message to Unicast address 0x0015 CLIENT: ESP_BLE_MESH_MODEL_SEND_COMP_EVT, err_code 0
Actual Behavior
I (91464) CLIENT: send vendor message to Unicast address 0x0015 E (91464) BLE_MESH: ready_to_send, Failed to- find DST 0x0015 E (91474) BLE_MESH: model_send, fail I (91474) CLIENT: ESP_BLE_MESH_MODEL_SEND_COMP_EVT, err_code -22
Steps to reproduce
Code to reproduce this issue
patches of our repo (forked from last master (1 June 2020)) will be prepared;
Debug log goes here, should contain the backtrace, as well as the reset source if it is a crash. Please copy the plain text here for us to search the error log. Or attach the complete logs but leave the main part here if the log is too long.