espressif / esp-matter

Espressif's SDK for Matter
Apache License 2.0
648 stars 151 forks source link

SRP update error via OpenThread (CON-1307) #1056

Open hatl opened 1 month ago

hatl commented 1 month ago

Using the existing examples, I wrote some basic code to expose a switch (onoff plugin) and buttons - see: https://github.com/hatl/esp-matter-code When commissioning the device via WiFi, everything works as expected and the device shows up in Home Assistant. Then, I switched to OpenThread - according to https://docs.espressif.com/projects/esp-matter/en/latest/esp32c6/developing.html#flashing-the-firmware

To enable Thread, you should change the menuconfig options to CONFIG_OPENTHREAD_ENABLED=y, CONFIG_ENABLE_WIFI_STATION=n, and CONFIG_USE_MINIMAL_MDNS=n.

When using Thread, the error E (34252) chip[DL]: SRP update error: internal server error as mentioned below keeps repeating.

Environment

Any additional details

Log for OpenThread commissioning:

I (1472) chip[ZCL]: Initiating Admin Commissioning cluster.
I (1472) chip[DIS]: Updating services using commissioning mode 1
E (1482) chip[DIS]: Failed to remove advertised services: 3
I (1492) chip[DIS]: Advertise commission parameter vendorID=65522 productID=32769 discriminator=2026/07 cm=1 cp=0
E (1502) chip[DIS]: Failed to advertise commissionable node: 3
E (1502) chip[DIS]: Failed to finalize service update: 3
I (1512) chip[IN]: CASE Server enabling CASE session setups
I (1512) chip[SVR]: Joining Multicast groups
I (1522) chip[SVR]: Server Listening...```
I (1532) esp_matter_core: Dynamic endpoint 0 added
I (1532) esp_matter_attribute: ********** W : Endpoint 0x0001's Cluster 0x00000003's Attribute 0x00000001 is 0 **********
I (1542) esp_matter_core: Dynamic endpoint 1 added
I (1552) esp_matter_attribute: ********** W : Endpoint 0x0002's Cluster 0x00000003's Attribute 0x00000001 is 0 **********
I (1562) esp_matter_core: Dynamic endpoint 2 added
I (1562) esp_matter_attribute: ********** W : Endpoint 0x0003's Cluster 0x00000003's Attribute 0x00000001 is 0 **********
I (1572) esp_matter_attribute: ********** W : Endpoint 0x0003's Cluster 0x00000004's Attribute 0x00000000 is 128 **********
I (1592) esp_matter_attribute: ********** W : Endpoint 0x0003's Cluster 0x00000004's Attribute 0x0000FFFC is <invalid type: 0> **********
I (1602) chip[ZCL]: 0x42156e68 ep 3 clus 0x0000_0062 attr 0x0000_0000 not supported
I (1612) esp_matter_attribute: ********** R : Endpoint 0x0003's Cluster 0x00000006's Attribute 0x0000FFFC is 1 **********
I (1622) esp_matter_attribute: ********** R : Endpoint 0x0003's Cluster 0x00000006's Attribute 0x00004003 is 1 **********
I (1632) esp_matter_attribute: ********** R : Endpoint 0x0003's Cluster 0x00000006's Attribute 0x00000000 is 1 **********
I (1642) esp_matter_attribute: ********** R : Endpoint 0x0003's Cluster 0x00000006's Attribute 0x00000000 is 1 **********
I (1652) chip[ZCL]: Endpoint 3 On/off already set to new value
I (1662) esp_matter_core: Dynamic endpoint 3 added
I (1672) esp_matter_core: Cannot find minimum unused endpoint_id, try to find in the previous namespace
I (1682) esp_matter_core: Failed to open node namespace
I (1672) chip[DL]: Configuring CHIPoBLE advertising (interval 25 ms, connectable)
I (1702) NimBLE: GAP procedure initiated: advertise; 
I (1712) NimBLE: disc_mode=2
I (1712) NimBLE:  adv_channel_map=0 own_addr_type=1 adv_filter_policy=0 adv_itvl_min=40 adv_itvl_max=40
I (1722) NimBLE: 

I (1732) chip[DL]: CHIPoBLE advertising started
I (1732) app_main: Commissioning window opened

Repeating errror:

I(31152) OPENTHREAD:[I] SrpClient-----: Received response
I(31172) OPENTHREAD:[I] SrpClient-----: Server rejected Failed code:2
I(31172) OPENTHREAD:[I] SrpClient-----: HostInfo Adding -> ToAdd
I(31182) OPENTHREAD:[I] SrpClient-----: Retry interval 3060 ms
I(31182) OPENTHREAD:[I] SrpClient-----: State Updating -> ToRetry
E (31192) chip[DL]: SRP update error: internal server error
I(33582) OPENTHREAD:[I] Settings------: Read SrpEcdsaKey
I(33602) OPENTHREAD:[I] SrpClient-----: Send update
I(33602) OPENTHREAD:[I] SrpClient-----: HostInfo ToAdd -> Adding
I(33602) OPENTHREAD:[I] SrpClient-----: State ToRetry -> Updating
I(33652) OPENTHREAD:[I] Mac-----------: Frame tx attempt 1/16 failed, error:NoAck, len:122, seqnum:252, type:Data, src:0x2c32, dst:0x2c00, sec:yes, ackreq:yes
I(33712) OPENTHREAD:[I] Mac-----------: Frame tx attempt 1/16 failed, error:NoAck, len:122, seqnum:253, type:Data, src:0x2c32, dst:0x2c00, sec:yes, ackreq:yes
I(33742) OPENTHREAD:[I] Mac-----------: Frame tx attempt 2/16 failed, error:NoAck, len:122, seqnum:253, type:Data, src:0x2c32, dst:0x2c00, sec:yes, ackreq:yes
I(33802) OPENTHREAD:[I] Mac-----------: Frame tx attempt 1/16 failed, error:NoAck, len:122, seqnum:254, type:Data, src:0x2c32, dst:0x2c00, sec:yes, ackreq:yes
I(33892) OPENTHREAD:[I] MeshForwarder-: Sent IPv6 UDP msg, len:726, chksum:f0d9, ecn:no, to:0x2c00, sec:yes, prio:normal
I(33892) OPENTHREAD:[I] MeshForwarder-:     src:[...]:49154
I(33902) OPENTHREAD:[I] MeshForwarder-:     dst:[...]:53538
I(34192) OPENTHREAD:[I] MeshForwarder-: Received IPv6 UDP msg, len:60, chksum:fd7c, ecn:no, from:0x2c00, sec:yes, prio:normal, rss:-78.0
I(34202) OPENTHREAD:[I] MeshForwarder-:     src:[...]:53538
I(34202) OPENTHREAD:[I] MeshForwarder-:     dst:[...]:49154
I(34222) OPENTHREAD:[I] SrpClient-----: Received response
I(34232) OPENTHREAD:[I] SrpClient-----: Server rejected Failed code:2
I(34232) OPENTHREAD:[I] SrpClient-----: HostInfo Adding -> ToAdd
I(34242) OPENTHREAD:[I] SrpClient-----: Retry interval 5 sec
I(34242) OPENTHREAD:[I] SrpClient-----: State Updating -> ToRetry
E (34252) chip[DL]: SRP update error: internal server error

During WiFi commissioning, I get some errors, but it works:

E (25150) chip[DMG]: Error retrieving data from clusterId: 0x0000_0028, err = b
E (25260) chip[DMG]: Error retrieving data from clusterId: 0x0000_003C, err = b
E (25370) chip[DMG]: Error retrieving data from clusterId: 0x0000_003E, err = b
E (25500) chip[DMG]: Error retrieving data from clusterId: 0x0000_0036, err = b
E (25740) chip[DMG]: Error retrieving data from clusterId: 0x0000_0040, err = b
E (25960) chip[DMG]: Error retrieving data from clusterId: 0x0000_0040, err = b
E (26230) chip[DMG]: Error retrieving data from clusterId: 0x0000_0006, err = b
E (26930) chip[DMG]: Error retrieving data from clusterId: 0x0000_0028, err = b
E (27030) chip[DMG]: Error retrieving data from clusterId: 0x0000_0033, err = b
E (27120) chip[DMG]: Error retrieving data from clusterId: 0x0000_003E, err = b
E (27240) chip[DMG]: Error retrieving data from clusterId: 0x0000_0036, err = b
E (27480) chip[DMG]: Error retrieving data from clusterId: 0x0000_0040, err = b
E (27690) chip[DMG]: Error retrieving data from clusterId: 0x0000_0040, err = b
E (27960) chip[DMG]: Error retrieving data from clusterId: 0x0000_0006, err = b
hatl commented 3 weeks ago

It's strange. On very rare occasions, the pairing works...

hatl commented 3 weeks ago

I tried to run the same code on an ESP32-H2. Unfortunately, this also fails.

I'm also getting the following errors:

I (1541) chip[ZCL]: Initiating Admin Commissioning cluster. 
I (1551) chip[DIS]: Updating services using commissioning mode 1
E (1551) chip[DIS]: Failed to remove advertised services: 3 
I (1561) chip[DIS]: Advertise commission parameter vendorID=65521 productID=32768 discriminator=0164/00 cm=1 cp=0
E (1571) chip[DIS]: Failed to advertise commissionable node: 3
E (1571) chip[DIS]: Failed to finalize service update: 3
...
I (94920) chip[SVR]: Commissioning completed successfully
...
E (98700) chip[DL]: SRP update error: internal server error
hatl commented 3 weeks ago

i could narrow it down further: the commissioning works when using the test device attestation. when i switch the config to factory attestation, i get the errors as described.

i've changed the following parameters:

CONFIG_ENABLE_ESP32_FACTORY_DATA_PROVIDER=y
CONFIG_ENABLE_ESP32_DEVICE_INSTANCE_INFO_PROVIDER=y
CONFIG_SEC_CERT_DAC_PROVIDER=y
CONFIG_FACTORY_DEVICE_INSTANCE_INFO_PROVIDER=y

i generate the factory and cert images using the following command:

esp-matter-mfg-tool -cn "My OnOff Plug" -v 0xFFF2 -p 0x8001 --pai \
--vendor-name "vendor" --product-name "My OnOff Plug" --hw-ver 1 --hw-ver-str v1.0 --mfg-date 2024-08-09 \
-k $MATTER_SDK_PATH/credentials/test/attestation/Chip-Test-PAI-FFF2-8001-Key.pem \
-c $MATTER_SDK_PATH/credentials/test/attestation/Chip-Test-PAI-FFF2-8001-Cert.pem \
-cd $MATTER_SDK_PATH/credentials/test/certification-declaration/Chip-Test-CD-FFF2-8001.der \
--dac-in-secure-cert

flashed using:

esptool.py -p /dev/ttyACM0 write_flash 0x10000 out/fff2_8001/37d58392-35db-4ef8-a8ac-9e9576c982f0/*-partition.bin
esptool.py -p /dev/ttyACM0 write_flash 0xd000 out/fff2_8001/37d58392-35db-4ef8-a8ac-9e9576c982f0/*_esp_secure_cert.bin
hatl commented 2 weeks ago

Thanks to the support of @trailblazing-tech I could narrow it down further. The issue only occurs when using an ESP Thread Border Router. It works fine, when using a ZBT-1 (SkyConnect) USB dongle. Currently, I'm running v1.0. I'll continue testing using the latest commit in the main branch.

log.txt

wqx6 commented 1 week ago

Could you tell me the commit of ESP-IDF you are using for Thread BR? Looks like it is the issue of Thread BR? Are you using the esp-thread-br repo https://github.com/espressif/esp-thread-br? Or you are just using the example from ESP-IDF?

hatl commented 1 week ago

I tried v1.0 in the repo: https://github.com/espressif/esp-thread-br.git and idf v5.2.2

But I also tried using a more recent version: 6011f07ca9bd31686944b33422d86c1e4d429479 together with idf v5.3.0

wqx6 commented 1 week ago

Could you verify that the dataset on the end-device is the same as the dataset of the Thread BR? You can get the dataset of the end-device by input matter esp ot_cli dataset active -x in the end-device's console.