project-chip / rs-matter

Rust implementation of the Matter protocol. Status: Experimental
Apache License 2.0
303 stars 43 forks source link

Keep the local fabric idx in subscriptions #181

Closed ivmarkov closed 1 month ago

ivmarkov commented 1 month ago

Necessary for fixing this issue - yet - not enough. Why - read below.

I'm PR-ing this part for now because I think this is uncontroversial and is necessary anyway.

Problem: I forgot that the peer node ID is only unique in the context of a concrete fabric, so if we want to uniquely identify a node, this is done by the pair (fabric-index/fabric-id, node-id) and not just node-id.

The PR is slightly bigger, as I took the liberty to rename node_id to peer_node_id in the subscription context, as well as in the called transport methods. I think this is worthwhile, or else node_id can be interpreted as our node ID, which is incorrect of course.

=================

Why is this fix not enough? (I'm seeking your advise what could be the problem.)

Apple Matter is a bit weird. Putting aside the fact that we are commissioned into two fabrics, the other weird thing that happens is that once we are commissioned in the second fabric, they are communicating with us on behalf of TWO different IP addresses, which - however - do represent the same node+fabric ID?!

And now the real weird part: we receive a subscription request from IP Address 2 and this all goes well. But when the time comes to report on that subscription in some seconds / minutes and initiate a new exchange, it just so happens that we select an existing session for "IP Address 1" instead, because we are just looking for a session matching the fabric index (=1) and the node ID. And since the first address also has created sessions to us on behalf of fabric 1, we just happen to pick a session for IP Address 1, even if the subscription came from IP Address 2.

... and get "Invalid Subscription" from the other peer.

Now, if we pick a session based on fabric index + peer node ID + remote address, then we select a session with "IP Address 2" (the address from where the subscription came) and all is well!

So @kedars what do you think is going on? Shall I just also keep the IP address in the subscription data and do a match by that address too? This works (confirmed) but sounds a bit weird. Am I missing something in the spec?

Session dump:

Sessions for IP Address 1:
=============================================

(Skip this block, it contains non-CASE sessions, not interesting):
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(1088638830675445622), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 9017568, mode: PlainText, ts: 1717612457.779132481s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(0), local_nodeid: 0, local: 1, remote: 8566, msg_ctr: 44517232, mode: Pase(NotExpected), ts: 1717612462.458154527s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(9297024357880017542), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 87475376, mode: PlainText, ts: 1717612462.78951294s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(17849542785653146437), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 35224863, mode: PlainText, ts: 1717612493.671745767s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(1088638830675445622), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 9017568, mode: PlainText, ts: 1717612457.779132481s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(0), local_nodeid: 0, local: 1, remote: 8566, msg_ctr: 44517232, mode: Pase(NotExpected), ts: 1717612462.458154527s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(9297024357880017542), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 87475376, mode: PlainText, ts: 1717612462.78951294s
Session: fb_idx: None,    peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(17849542785653146437), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 35224863, mode: PlainText, ts: 1717612493.671745767s

!!! This block is interesting:
Session: fb_idx: Some(1), peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(864992019), local_nodeid: 1121512030, local: 2, remote: 8567, msg_ctr: 198121502, mode: Case(CaseDetails { fab_idx: 1, cat_ids: [0, 0, 0] }), ts: 1717612496.261807843s
Session: fb_idx: Some(2), peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(3725013030), local_nodeid: 4012158658, local: 3, remote: 8568, msg_ctr: 254531801, mode: Case(CaseDetails { fab_idx: 2, cat_ids: [0, 0, 0] }), ts: 1717612494.724025958s
Session: fb_idx: Some(1), peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(864992019), local_nodeid: 1121512030, local: 2, remote: 8567, msg_ctr: 198121502, mode: Case(CaseDetails { fab_idx: 1, cat_ids: [0, 0, 0] }), ts: 1717612500.816538095s
Session: fb_idx: Some(2), peer: [fe80::8d7:cf23:8b0b:f715%2]:51398, peer_nodeid: Some(3725013030), local_nodeid: 4012158658, local: 3, remote: 8568, msg_ctr: 254531801, mode: Case(CaseDetails { fab_idx: 2, cat_ids: [0, 0, 0] }), ts: 1717612494.724025958s

Sessions for IP Address 2:
=============================================
(Skip this block, it contains non-CASE sessions, not interesting):
Session: fb_idx: None,    peer: [fe80::416:93dd:216a:d50a%2]:65218, peer_nodeid: Some(1162526195568380034), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 84020839, mode: PlainText, ts: 1717612496.321803477s
Session: fb_idx: None,    peer: [fe80::416:93dd:216a:d50a%2]:65218, peer_nodeid: Some(1162526195568380034), local_nodeid: 0, local: 0, remote: 0, msg_ctr: 84020839, mode: PlainText, ts: 1717612496.321803477s

!!! This block is interesting: see - fb_idx = 1 and peer_nodeid = 864992019 which is also present for Address 1?!
Session: fb_idx: Some(1), peer: [fe80::416:93dd:216a:d50a%2]:65218, peer_nodeid: Some(864992019), local_nodeid: 1121512030, local: 4, remote: 48265, msg_ctr: 226861159, mode: Case(CaseDetails { fab_idx: 1, cat_ids: [0, 0, 0] }), ts: 1717612500.815073586s
Session: fb_idx: Some(1), peer: [fe80::416:93dd:216a:d50a%2]:65218, peer_nodeid: Some(864992019), local_nodeid: 1121512030, local: 4, remote: 48265, msg_ctr: 226861159, mode: Case(CaseDetails { fab_idx: 1, cat_ids: [0, 0, 0] }), ts: 1717612500.815073586s
ivmarkov commented 1 month ago

@andreilitvin ^^^

ivmarkov commented 1 month ago

Moving the IP-related follow up discussion here.