Open Survensa opened 2 years ago
After the device connects to the second network, it can't communicate with the chip-tool, which is on the original network, right? So the behavior here seems completely expected... @Survensa
Agree - the omission of the verification step here in the test plan is intentional. chip-tool won't receive a response on its network if the network has been changed.
I can look at that also but, I can see that extended and doesn't match ./chip-tool networkcommissioning connect-network hex:1161111128222222 1 0 --Breadcrumb 2 - extended panid 1161111128222222
./chip-tool networkcommissioning add-or-update-thread-network hex:0e080000000000010000000300000f35060004001fffe0020811111111282222220708fd5930b7d8b62f58051000112233445562778899aabbccddeeff030e4f70656e54687265746444656d6f010212340410005ed405199188e1a5c0dd607282d7480c0402a0fff8 22 0 - extended panid 1111111128222222
Cert Blocker Review: It appears that we may need to change the test plan. @cecille can you please file an issue against the test plan and close this one?
Thanks!
@doublemis1 Checked for extended panid but facing same issue :
./chip-tool networkcommissioning connect-network hex:1111111107222222 48 0 --Breadcrumb 2
./chip-tool networkcommissioning add-or-update-thread-network hex:0e080000000000010000000300000f35060004001fffe0020811111111072222220708fd9957a92dadc7aa051000112233445966778899aabbccddeeff030f4f70656e54687265616444656d6f6a0102123404106ad73489ca556d19f8c47fd2150e58e30c0402a0fff8 48 0
Full log : cnet-4.12 error.txt
Hi, I was able to reproduce it. The Border Router 1 and Border Router 2 are within the same network (not Thread Network) and may discover its services by DNS. Thats way after connect DUT to the Border Router 2 network, the DUT informs that SRP services already existis (existing on Border Router 1).
I: 375759 [DL]SRP Client was stopped, because current server is no longer detected.
I: 375767 [DL]SRP Client was started, detected server: fd2f:a875:aab9:3927:a4b1:7250:5381:4135
D: 375776 [DL]OpenThread State Changed (Flags: 0x200002a4)
D: 375819 [DL] Device Role: CHILD
D: 375822 [DL] Partition Id: 0xD192944
D: 375831 [DL]OpenThread State Changed (Flags: 0x00000003)
D: 375837 [DL] Thread Unicast Addresses:
D: 375881 [DL] fd11:db::275:5101:6228:531d/64 valid preferred
D: 375887 [DL] fd2f:a875:aab9:3927:0:ff:fe00:4c01/64 valid rloc
D: 375894 [DL] fd2f:a875:aab9:3927:5a87:1ca3:6d30:7f7e/64 valid
D: 375901 [DL] fe80::3481:909f:b6e4:e6d6/64 valid preferred
D: 375912 [DMG]ICR moving to [ Preparing]
D: 375916 [DMG]ICR moving to [AddingComm]
D: 375919 [DMG]ICR moving to [AddedComma]
D: 375925 [DMG]Endpoint 0, Cluster 0x0000_0030 update version to 192d259f
D: 375931 [DMG]Decreasing reference count for CommandHandler, remaining 0
I: 375940 [IN]Prepared secure message 0x20004ce4 to 0x000000000001B669 (1) of type 0x9 and protocolId (0, 1) on exchange 12594r with MessageCounter:108659917.
I: 375954 [IN]Sending encrypted msg 0x20004ce4 with MessageCounter:108659917 to 0x000000000001B669 (1) at monotonic time: 000000000005BC92 msec
D: 375969 [DMG]ICR moving to [CommandSen]
D: 375973 [DMG]ICR moving to [AwaitingDe]
D: 375977 [DL]Using Thread extended MAC for hostname.
I: 375983 [DIS]Advertise operational node 529BA477D61319E8-0000000000000001
I: 375992 [SVR]Operational advertising enabled
I: 376146 [EM]Received message of type 0x10 with protocolId (0, 0) and MessageCounter:246502512 on exchange 12594r
D: 376156 [EM]Found matching exchange: 12594r, Delegate: (nil)
D: 376163 [EM]Rxd Ack; Removing MessageCounter:108659917 from Retrans Table on exchange 12594r
D: 376172 [EM]Removed CHIP MessageCounter:108659917 from RetransTable on exchange 12594r
I: 376181 [DL]OpenThread SED interval set to 1000ms
**E: 377284 [DL]SRP update error: domain name or RRset is duplicated**
I was able to pass this test using only docker networks. In this solution, border routers are on separate networks. Equipment:
Creating docker networks:
Run Border routers
*Note web GUI for otbr1 will be avaliable on 0.0.0.0:8080, for otbr2 0.0.0.0:8081
Form networks: Form networks on both Border Routers with different datasets.
Proceed the test.
Currently, the SRP Replication feature is not supported on the current Thread 1.3, hence the border routers from two separate Thread Networks can discover its services within a local network (using backbone interface) but synchronization and replication handling are not implemented.
@doublemis1 I am able to connect to second network through second OTBR but the issue is I am not able to proceed further, I was forced to restart the whole setup.
What is the issue? Please provide logs and steps how to reproduce
@doublemis1 Once after the second network is connected, I am no able to execute any command as it end with Timeout error
Attached both chiptool and DUT log for referance :
Thanks @doublemis1
Followed the steps to get connected and able to run the test case but not able to connect with first network once after disarming the Armfailsafe and ends with Timeout error
Error log :
I: 583601 [TS]Previous Last Known Good Time: 2022-07-29T05:46:32 I: 583607 [TS]Reverted Last Known Good Time to previous value D: 583613 [DMG]Endpoint 0, Cluster 0x0000_0030 update version to 9380461a D: 583621 [ZCL]Failsafe timeout, tell platform driver to revert network credentials. E: 583629 [NP]Found Thread configuration backup: reverting configuration E: 583723 [DL]Long dispatch time: 256 ms, for event type 32782 I: 583729 [FS]Fail-safe cleanly disarmed D: 583732 [DL]OpenThread State Changed (Flags: 0x11002046) D: 583738 [DL] Device Role: DETACHED D: 583741 [DL] Thread Unicast Addresses: D: 583745 [DL] fdd0:6b61:12a4:6da6:38a8:de10:b0ba:7e89/64 valid D: 583751 [DL] fe80::c831:832e:c849:8deb/64 valid preferred D: 583757 [DL] fd11:92::a1b:f16d:d7f3:d2f2/64 valid preferred D: 583766 [DL]OpenThread State Changed (Flags: 0x10034000) D: 583771 [DL] Network Name: OpenThreadDemo D: 583775 [DL] PAN Id: 0x1234 D: 583780 [DL] Extended PAN Id: 0x2111111122222222 D: 583785 [DL] Channel: 11 D: 583788 [DL] Mesh Prefix: fdd0:6b61:12a4:6da6::/64 I: 583796 [DL]CHIPoBLE advertising disabled because device is fully provisioned D: 583803 [DL]OpenThread State Changed (Flags: 0x1100107d) D: 583809 [DL] Device Role: DETACHED D: 583812 [DL] Thread Unicast Addresses: D: 583816 [DL] fdd0:6b61:12a4:6da6:38a8:de10:b0ba:7e89/64 valid D: 583822 [DL] fe80::c831:832e:c849:8deb/64 valid preferred D: 583828 [DL] fd11:92::a1b:f16d:d7f3:d2f2/64 valid preferred I: 584675 [DL]SRP Client was stopped, because current server is no longer detected. I: 584682 [DL]SRP Client was started, detected server: fdd0:6b61:12a4:6da6:3720:f78e:b659:724e D: 584692 [DL]OpenThread State Changed (Flags: 0x200002a4) D: 584734 [DL] Device Role: CHILD D: 584737 [DL] Partition Id: 0x17775288 D: 584746 [DL]OpenThread State Changed (Flags: 0x00000003) D: 584752 [DL] Thread Unicast Addresses: D: 584796 [DL] fd11:13::a8e0:a386:f9a4:fde0/64 valid preferred D: 584804 [DL] fdd0:6b61:12a4:6da6:0:ff:fe00:1003/64 valid rloc D: 584810 [DL] fdd0:6b61:12a4:6da6:38a8:de10:b0ba:7e89/64 valid D: 584817 [DL] fe80::c831:832e:c849:8deb/64 valid preferred D: 585777 [DL]SRP update succeeded
@woody-apple - can you clarify what test plan changes are needed here?
@cecille After a ConnectNetwork, the test plan expects to still receive the response to the original command on the old network that the device is no longer connected to.
@bzbarsky-apple, currently (at least nrfconnect platform) transmits a response to the old network, and right after sending the response DUT, switches to the new network.
Connect network command is sent in steps 7 and 16. Neither one of those steps has any expected outcome specified. I suppose it might have been better to add an explicit note that no response is required, but I don't think it's wrong.
I think the fact that the chip tool command is failing is scaring people, but the test plan doesn't specify that the response needs to come - that's a chip-tool assumption. So I think this can be fixed in the verification steps.
I test it right now, and the device without any issue connects to the second Thread Network and after the arm-fail-safe timeout, the device goes back to the first Thread Network. The issue is with advertised operation services.
= otbr20 IPv6 E5770245A0D6B77C-0000000000000001 _matter._tcp local
hostname = [0A82FF282505E2CC.local]
address = [fd11:ad::a2ae:53a0:7c81:5aac]
port = [5540]
txt = ["T=0" "SAI=500" "SII=6000"]
= otbr0 IPv6 E5770245A0D6B77C-0000000000000001 _matter._tcp local
hostname = [0A82FF282505E2CC.local]
address = [fd11:4b::6414:7609:b69a:b6e0]
port = [5540]
txt = ["T=0" "SAI=500" "SII=6000"]
The device doesn't remove the SRP service from the first OTBR (Thread Network) while connecting to the second OTBR. And while the device is going back to the first Thread Network, DUT doesn't remove the SRP service, thus from the controller's perspective, it needs to choose one out of two existing operational services.
The controller wants to connect the device
[1660056811.552199][59852:59857] CHIP:DL: Avahi resolve found
[1660056811.552282][59852:59857] CHIP:DIS: Node ID resolved for E5770245A0D6B77C:0000000000000001
[1660056811.552301][59852:59857] CHIP:DIS: Hostname: 0A82FF282505E2CC
[1660056811.552321][59852:59857] CHIP:DIS: IP Address #1: fd11:ad::a2ae:53a0:7c81:5aac
[1660056811.552333][59852:59857] CHIP:DIS: Port: 5540
[1660056811.552345][59852:59857] CHIP:DIS: Mrp Interval idle: 6000 ms
[1660056811.552355][59852:59857] CHIP:DIS: Mrp Interval active: 500 ms
[1660056811.552362][59852:59857] CHIP:DIS: TCP Supported: 0
[1660056811.553013][59852:59857] CHIP:DIS: Lookup clearing interface for non LL address
While the current device dataset is
uart:~$ ot dataset active
Active Timestamp: 1
Channel: 17
Channel Mask: 0x07fff800
Ext PAN ID: 1111111122222254
Mesh Local Prefix: fd32:1a2f:ba8c:69df::/64
Network Key: 00112233445566778899aabbccddeeff
Network Name: BR_1
PAN ID: 0xabbd
PSKc: e7d33a45227eb3ea460566cac9c96a79
Security Policy: 672 onrc
Done
uart:~$ ot ipaddr
fd11:4b:0:0:6414:7609:b69a:b6e0
fd32:1a2f:ba8c:69df:0:ff:fe00:c001
fd32:1a2f:ba8c:69df:6879:1371:79f1:989a
fe80:0:0:0:882:ff28:2505:e2cc
Done
uart:~$
The stale SRP record thing sounds like https://github.com/project-chip/connectedhomeip/issues/21101
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This stale issue has been automatically closed. Thank you for your contributions.
Problem
Steps 2 - 7: remove th_xpan network, add THREAD_2ND_OPERATIONALDATASET, Connect to THREAD_2ND_OPERATIONALDATASET After Step 7 :
./chip-tool networkcommissioning connect-network hex:1161111128222222 1 0 --Breadcrumb 2
not able to proceed further due to run command failure.Actual behavior after connect-network with THREAD_2ND_OPERATIONALDATASET
Steps to reproduce
./chip-tool networkcommissioning remove-network hex:1111111152222222 22 0
}./chip-tool networkcommissioning add-or-update-thread-network hex:0e080000000000010000000300000f35060004001fffe0020811111111282222220708fd5930b7d8b62f58051000112233445562778899aabbccddeeff030e4f70656e54687265746444656d6f010212340410005ed405199188e1a5c0dd607282d7480c0402a0fff8 22 0 (second network dataset value)
}System configuration
Log : CNET Thread.txt