Closed kghost closed 2 years ago
@kpschoedel @yufengwangca @bzbarsky-apple @pan-apple
This bug is causing some serious problems, we should fix it before TE7
Confirm that the PeerAddress
, fetched from mDNS doesn't contain proper interface id.
Related: #11009
The choice not to specify an interface seems quite deliberate - from the DeviceController OnNodeIdResolved call:
// Only use the mDNS resolution's InterfaceID for addresses that are IPv6 LLA.
// For all other addresses, we should rely on the device's routing table to route messages sent.
// Forcing messages down an InterfaceId might fail. For example, in bridged networks like Thread,
// mDNS advertisements are not usually received on the same interface the peer is reachable on.
if (nodeData.mAddress.IsIPv6LinkLocal())
{
interfaceId = nodeData.mInterfaceId;
}
What actually has to be matched in order for the session to match. Ie, how is this address used? Matching on exact address seems fragile - IMO nodes should be free to send messages on whatever interface is most convenient.
From the spec (section 4.14.2.4. Exchange Message Processing)
Upon receipt of a message by the Exchange Layer, the message is matched to an existing Exchange.
A given message is part of an Exchange if it satisfies all the following criteria:
1. The Exchange ID of the message matches the Exchange ID of the Exchange,
2. The message has the I Flag set and its Source Node ID is the Initiator of the Exchange,
OR it does not have the I Flag set and its Destination Node ID is the Initiator of the Exchange.
3. The message was received over a secure session that is associated to the Fabric ID and Node ID
of the Peer Node.
^ Does matching the secure session require matching the ip address?
The mdns resolver does track the interface, but keep in mind that there are addresses that are reachable over multiple interfaces. Right now, Device and SecureSession assume a single address for a peer, and that address is currently whatever address was last in the mdns packet on whatever interface that was received on. If we rely on strict addresses to match sessions, we'll need to track the interfaces, make sure we don't overwrite on receipt of new mdns packets and also track the local address that we're tagging packets with. That sounds overly restrictive. My preference would be to have the Device class (and maybe even the SecureSession) track all the addresses for a device so it can fall back in case of reachability issues. We also have a gap on the other side because there's nothing to restrict the sender from selecting amongst their own addresses for the source. The device side really has no way to know all the available addresses from the controller side.
TL;DR - if there's a way to match exchange contexts without using the IP addresses, we should do that.
Is not about how to match the address, the address we are using to reply is wrong.
Expected behavior:
First session:
Controller 192.168.9.1:5541 -> 192.168.9.1:5540 Device
192.168.9.1:5541 <- 192.168.9.1:5540
The device cache this session (peer address 192.168.9.1:5541, local interface 192.168.9.1:5540)
Second session:
Controller 192.168.9.1:5541 -> 192.168.9.1:5540 Device
192.168.9.1:5541 <- 192.168.9.1:5540
Actual behavior
First session:
Controller 192.168.90.187:5541 -> 192.168.9.1:5540 Device
192.168.90.187:5541 <- 192.168.9.1:5540
The device cache this session (peer address 192.168.90.187:5541, local interface 192.168.9.1:5540)
Second session:
Controller 192.168.90.187:5541 -> 192.168.90.187:5540 Device (The session matches cached session, because the peer address is same)
192.168.90.187:5541 <- 192.168.9.1:5540 Error: The controller sends a packet to 192.168.90.187:5540 but got a response from 192.168.9.1:5540, it is unable to process the packet.
yeah, but what does it mean to use a "wrong" address. I can't find anywhere in the spec that says we're matching on address, so in theory anything that gets the message there should be ok.
For TE7, I suppose what we can do is have the device set the address only if another address is not already set.
The general challenge with that is that if the device really DOES need to switch its address, we'll be doing the wrong thing.
I suppose my question is really - given that the device does receive the message on the "incorrect" address, why is the message being rejected? What part of the code is taking the address into account? And how does that match to spec?
The part of the spec that matches on address is the matching for unencrypted sessions. That has to match on transport address, because there's nothing else to match on.
The real question here is this: how is the interface (and hence source address) selected based on the destination address in this case?
Perhaps I misunderstood then - is this happening during the case session establishment? Or when we start to send encrypted messages?
Reopen, because it still need a permanent fix
Spec PR, https://github.com/CHIP-Specifications/connectedhomeip-spec/issues/4674, is now merged so work on this can resume.
Problem
My testing box have 2 different active network interfaces:
The controller sends a message to 192.168.9.1, on the device side the message is received from 192.168.90.187. So the message is sent via the wrong socket.
Expected packet: 192.168.9.1:5541 -> 192.168.9.1:5540 Actual packet: 192.168.90.187:5541 -> 192.168.9.1:5540
we should response the message via the same interface where the message is received.
Scenario
This is triggered by
scripts/tests/test_suites.sh
During pairing phrase
Here is where the problem begins, in the following logs, the controller send a packet to 192.168.9.1, this address is discovered by mDNS, but the packet is sent via the wrong interface.
Controller side
Device side:
During test phrase
Controller sends message:
Device receives message and sends reply
Controller receives message:
Due to that the message is on a different UnauthenticatedSession, it is unable to match the session with the original exchange, so it is unable to handle the message.
Proposed Solution
UDP::SendMessage
should honor interface inPeerAddress
.