Closed kpark-apple closed 1 year ago
I looked back at the Apple border router log and found that updates were rejected by the Apple border router as well, most likely for the same reason as #18737. Hence, this is no issue for Nordic platform.
@Damian-Nordic I'm reopening this issue because @Abhayakara found an issue on Nordic side, too. Packet # 121 in the pcap file, the subtypes with TTL=0 must not be present, per https://www.ietf.org/archive/id/draft-ietf-dnssd-srp-13.html#name-handling-of-service-subtype . Specifically, "Similarly, there is no mechanism for deleting subtypes. A delete of a service deletes all of its subtypes. ".
Copied the packet #121 DNS records below.
Domain Name System (query)
Transaction ID: 0x7b33
Flags: 0x2800 Dynamic update
Zones: 1
Prerequisites: 0
Updates: 13
Additional RRs: 2
Zone
Updates
_matter._tcp.default.service.arpa: type PTR, class IN, 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
Name: _matter._tcp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 36
Domain Name: 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
_I8E21999A5C0AD09A._sub._matter._tcp.default.service.arpa: type PTR, class IN, 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
Name: _I8E21999A5C0AD09A._sub._matter._tcp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 2
Domain Name: 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa: type ANY, class ANY
Name: 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
Type: * (A request for all records the server/cache has available) (255)
Class: ANY (0x00ff)
Time to live: 0 (0 seconds)
Data length: 0
8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa: type SRV, class IN, priority 0, weight 0, port 5540, target FAD789F07C0D4A01.default.service.arpa
Service: 8E21999A5C0AD09A-00000000E0C9C955
Protocol: _matter
Name: _tcp.default.service.arpa
Type: SRV (Server Selection) (33)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 25
Priority: 0
Weight: 0
Port: 5540
Target: FAD789F07C0D4A01.default.service.arpa
8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa: type TXT, class IN
Name: 8E21999A5C0AD09A-00000000E0C9C955._matter._tcp.default.service.arpa
Type: TXT (Text strings) (16)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 21
TXT Length: 8
TXT: CRI=5000
TXT Length: 7
TXT: CRA=300
TXT Length: 3
TXT: T=0
_matterc._udp.default.service.arpa: type PTR, class NONE, 068094B6B32F6112._matterc._udp.default.service.arpa
Name: _matterc._udp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: NONE (0x00fe)
Time to live: 0 (0 seconds)
Data length: 19
Domain Name: 068094B6B32F6112._matterc._udp.default.service.arpa
_V65521._sub._matterc._udp.default.service.arpa: type PTR, class NONE, 068094B6B32F6112._matterc._udp.default.service.arpa
Name: _V65521._sub._matterc._udp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: NONE (0x00fe)
Time to live: 0 (0 seconds)
Data length: 2
Domain Name: 068094B6B32F6112._matterc._udp.default.service.arpa
_S15._sub._matterc._udp.default.service.arpa: type PTR, class NONE, 068094B6B32F6112._matterc._udp.default.service.arpa
Name: _S15._sub._matterc._udp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: NONE (0x00fe)
Time to live: 0 (0 seconds)
Data length: 2
Domain Name: 068094B6B32F6112._matterc._udp.default.service.arpa
_L3840._sub._matterc._udp.default.service.arpa: type PTR, class NONE, 068094B6B32F6112._matterc._udp.default.service.arpa
Name: _L3840._sub._matterc._udp.default.service.arpa
Type: PTR (domain name PoinTeR) (12)
Class: NONE (0x00fe)
Time to live: 0 (0 seconds)
Data length: 2
Domain Name: 068094B6B32F6112._matterc._udp.default.service.arpa
068094B6B32F6112._matterc._udp.default.service.arpa: type ANY, class ANY
Name: 068094B6B32F6112._matterc._udp.default.service.arpa
Type: * (A request for all records the server/cache has available) (255)
Class: ANY (0x00ff)
Time to live: 0 (0 seconds)
Data length: 0
FAD789F07C0D4A01.default.service.arpa: type ANY, class ANY
Name: FAD789F07C0D4A01.default.service.arpa
Type: * (A request for all records the server/cache has available) (255)
Class: ANY (0x00ff)
Time to live: 0 (0 seconds)
Data length: 0
FAD789F07C0D4A01.default.service.arpa: type AAAA, class IN, addr fd03:9558:1ab3:0:edf1:223f:4999:13ad
Name: FAD789F07C0D4A01.default.service.arpa
Type: AAAA (IPv6 Address) (28)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 16
AAAA Address: fd03:9558:1ab3:0:edf1:223f:4999:13ad
FAD789F07C0D4A01.default.service.arpa: type KEY, class IN
Name: FAD789F07C0D4A01.default.service.arpa
Type: KEY (security key) (25)
Class: IN (0x0001)
Time to live: 7200 (2 hours)
Data length: 68
Flags: 0x0201
Protocol: 3
Algorithm: ECDSA Curve P-256 with SHA-256 (13)
[Key ID: 40877]
Public Key: a02d07cbe1bd4da82e375211e4f3d3dad5e77ea2f932623f2940f0ddbffe9a3af55b2d41…
Additional records
<Root>: type OPT
Name: <Root>
Type: OPT (41)
UDP payload size: 1272
Higher bits in extended RCODE: 0x00
EDNS0 version: 0
Z: 0x8000
1... .... .... .... = DO bit: Accepts DNSSEC security RRs
.000 0000 0000 0000 = Reserved: 0x0000
Data length: 12
Option: UL - Update lease
<Root>: type SIG, class ANY
Name: <Root>
Type: SIG (security signature) (24)
Class: ANY (0x00ff)
Time to live: 0 (0 seconds)
Data length: 84
Type Covered: Unused (0)
Algorithm: ECDSA Curve P-256 with SHA-256 (13)
Labels: 0
Original TTL: 0 (0 seconds)
Signature Expiration: (0)Dec 31, 1969 16:00:00.000000000 PST
Signature Inception: (0)Dec 31, 1969 16:00:00.000000000 PST
Key Tag: 0
Signer's name: FAD789F07C0D4A01.default.service.arpa
Signature: 7fc400c594bef6b782fb832d34f4a81546568dd49f3eeddbfa129631dede49d8337fdd48…
[Response In: 124]
@Damian-Nordic could you please take another look at this issue as this particular issues is causing several failures in our testing and in general is the root cause of poor pairing reliability on Nordic platform?
@aajain-com The issue with sending PTR records with TTL=0 must be fixed in OpenThread before we can integrate it into the Nordic's SDK. Abtin commented in https://github.com/openthread/openthread/issues/7742 that he would plan to take a look at that. Is there a confirmation from Apple Border Router developers that the records are the reason why the SRP update is rejected?
@Damian-Nordic It was confirmed that if subtypes with TTL=0 are included, such update will be refused.
@jwhui @abtink Could you help me understand if this issue has been addressed in OpenThread already? It was also mentioned by Ted Lemon in https://github.com/openthread/openthread/issues/7742#issuecomment-1143769992, and then I saw https://github.com/openthread/openthread/pull/7795/files, but I couldn't find any commits concerning the SRP client part. Also, I cannot find any conditions preventing the client to send subtypes with TTL=0 in https://github.com/openthread/openthread/blob/7e7da0e21149925d42f26f2e563e15c0b17226b2/src/core/net/srp_client.cpp#L1004.
@Damian-Nordic, TLDR, I will add a PR in OT to change this to unblock this.
Some thoughts on this:
Some background on this:
It seems pretty simple to just not emit the subtype if its lifetime is zero.
Hi, I managed to test Apple border router with the Nordic Thread device containing the sub-type PTR removal fix from @abtink. It works better, as Apple border router accepts the service removal sent by the Matter firmware. Unfortunately it still rejects the removal using OpenThread CLI commands (ot srp client service remove ... ) with the same error message "[DL]Operation refused for security reasons". On the other hand service removal using both methods (by the Matter firmware and ot cli) works fine with the OpenThread Border Router, so still we have some behavior difference here.
To discover the difference between Matter firmware and OT cli I gathered pcaps from Wireshark. It looks like the Matter firmware doesn't simply remove single service, but also by the way updates another services. OT cli srp remove command just removes the single service, so my assumption is that Apple Border Router rejects the SRP updates that contain only service removal operation. Please see the wireshark screenshots:
SRP removal request by the OpenThread CLI remove command
SRP removal request by the Matter firmware:
@Abhayakara @kpark-apple do you have any ideas on why single remove operation could be rejected by the Apple TBR?
It looks like the OT cli is removing the service PTR and service instance name and adding the host, so you have a host with no services after this is done. This should be okay. I don't know if you're able to send me un-redacted logs from your Thread BR (you'd have to run this with the debugging profile and then generate a sysdiagnose).
Instructions for debugging profile here if you don't have them: https://download.developer.apple.com/iOS/tvOS_Logs/mDNSResponder_Logging_Instructions.pdf General page for info about debugging profiles here: https://developer.apple.com/bug-reporting/profiles-and-logs/
The instructions include how to generate a sysdiagnose and attach it e.g. to an email message or upload it. If you can do the above while reproducing the issue, I should be able to figure out why the update is being rejected by looking at the log. Bear in mind that all your DNS lookups are not private while this profile is installed.
@Abhayakara thanks for confirming the procedure looks correct at first glance.
Unfortunately I'm struggling with often stability issues related to the Matter beta firmware on iOS/tvOS, so basically setting things up and reproducing something takes me pretty lot of time. Currently I need to switch to another topic, so I will not be able to help in debugging for one or few days.
@kpark-apple could you or somebody from Apple engineers that is more fluent with debugging your devices help on that? The steps to reproduce are very easy:
Make sure this change is used in the accessory code: https://github.com/openthread/openthread/pull/7883/files#diff-0f22c3ae39951c9378d2e1305e1cf55e5750a5a584f007fcc93ab7c9e618ad35R1004 (e.g. by modifying it locally, as OpenThread upmerge was not done yet).
Invoke ot srp client service
command in the Nordic device console to see available services
Pick one of the services and try to remove it: ot srp client service remove <service_name> <instance_name>
, e.g. ot srp client service remove 3FA268ADA58260F2 _matterc._udp
See in TBR logs why the remove request was rejected by the router.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This stale issue has been automatically closed. Thank you for your contributions.
Problem
Keep seeing the message in the above title when pairing Nordic lighting app and consequently failing the pairing. It occurred with both commit 9cc0fb36dbc1508159788cac6adf26322ec5f558 as well as with cfc35951be66a664a6efdadea56d1b8ea6e63e96.
Attached files are the Nordic log and Wireshark PCAP log, reproduced with cfc35951be66a664a6efdadea56d1b8ea6e63e96. The problem occurred between 7:25:19am PDT and 7:26:15am PDT.
802.15.4 decryption key: B38E612D11DC3BB798545AA8BC1C9D07 Key index: 0
20220517-nordic-refused.txt 20220517-Nordic-operation-refused-for-security-reasons.pcapng.zip
Proposed Solution
N/A