OpenCyphal / pycyphal

Python implementation of the Cyphal protocol stack.
https://pycyphal.readthedocs.io/
MIT License
117 stars 105 forks source link

Cyphal/UDP: LinkLayerSniffer shall only accept frames from a specific NIC #314

Open pavel-kirienko opened 10 months ago

pavel-kirienko commented 10 months ago

The UDP tracer reports invalid transfer-ID errors as follows:

2023-09-01 23:54:02 1930786 DEB pycyphal.transport.redundant._tracer: RedundantTracer({(<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945704416784): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a4899010>, (<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945717383696): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a71d3b90>}): Processing RedundantCapture(timestamp=Timestamp(system_ns=1693601642728726000, monotonic_ns=993420910877439), inferior=UDPCapture(timestamp=Timestamp(system_ns=1693601642728726000, monotonic_ns=993420910877439), link_layer_packet=LinkLayerPacket(protocol=2, source=01005e001d55, destination=00248c4216fd, payload=4500003fb5a040001011e647c0a801c8ef001d55952a24a6002bcf0201043f8fffff551d3e000000000000000000008000003bfa010000000000569e7bb9a4)), iface_id=139945717383696, transfer_id_modulo=18446744073709551616)
2023-09-01 23:54:02 1930786 DEB pycyphal.transport.redundant._tracer: RedundantTracer({(<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945704416784): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a4899010>, (<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945717383696): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a71d3b90>}): BYPASS: UDPErrorTrace(timestamp=Timestamp(system_ns=1693601642728726000, monotonic_ns=993420910877439), error=<Error.UNEXPECTED_TRANSFER_ID: 2>)

Same but with manual line breaks:

2023-09-01 23:54:02 1930786 DEB pycyphal.transport.redundant._tracer:
    RedundantTracer({(<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945704416784): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a4899010>, (<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945717383696): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a71d3b90>}):
        Processing RedundantCapture(
            timestamp=Timestamp(system_ns=1693601642728726000, monotonic_ns=993420910877439),
                inferior=UDPCapture(
                    timestamp=Timestamp(system_ns=1693601642728726000, monotonic_ns=993420910877439),
                    link_layer_packet=LinkLayerPacket(
                        protocol=2,
                        source=01005e001d55,
                        destination=00248c4216fd,
                        payload=4500003fb5a040001011e647c0a801c8ef001d55952a24a6002bcf0201043f8fffff551d3e000000000000000000008000003bfa010000000000569e7bb9a4
                    )
                ),
                iface_id=139945717383696,
                transfer_id_modulo=18446744073709551616)

2023-09-01 23:54:02 1930786 DEB pycyphal.transport.redundant._tracer:
    RedundantTracer({(<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945704416784):
        <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a4899010>, (<class 'pycyphal.transport.udp._udp.UDPTransport'>, 139945717383696): <pycyphal.transport.udp._tracer.UDPTracer object at 0x7f47a71d3b90>}):
            BYPASS:
                UDPErrorTrace(
                    timestamp=Timestamp(system_ns=1693601642728726000,
                    monotonic_ns=993420910877439),
                    error=<Error.UNEXPECTED_TRANSFER_ID: 2>
                )

Reproducible with:

UAVCAN__UDP__IFACE="127.0.0.1 192.168.1.200" UAVCAN__NODE__ID=12345 y -vv mon 2>error.log

Not reproducible without interface redundancy.

pavel-kirienko commented 10 months ago

This issue affects only the capture mode, hence its severity is not high. Normal operation is not affected.

The reason is that when a UDP transport instance is requested to commence capture, it launches a LinkLayerSniffer with the BPF expression udp and dst net 239.0.0.0/15:

LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3)

If there are N>1 instances, which is the case if redundant UDP interfaces are used, there will be N sniffers running in parallel, each configured to collect frames matching the BPF expression from all suitable network interfaces, causing each frame to be duplicated N times.

The solution is to confine each instance of LinkLayerSniffer to its own interface only. This requires querying the networking stack for the mapping between a local iface address and its NIC name. This does not address the case when there is more than one address per local interface.

ALSO: The BPF expression should filter on the destination port number as well:

udp and dst net 239.0.0.0/15 and dst port 9382
pavel-kirienko commented 10 months ago

Related setup log with two interfaces configured as shown above; observe the same LinkLayerSniffer being set up twice with identical parameters:

2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._v4: Constructed BPF filter expression: 'udp and dst net 239.0.0.0/15'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Synthetic device 'any' does not support promiscuous mode, skipping
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Capturable network devices: ['enp6s0', 'vcan0', 'vcan1', 'vcan2', 'lo', 'enp4s0', 'wlp8s0', 'bluetooth-monitor', 'nflog', 'nfqueue', 'dbus-system', 'dbus-session']
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'enp6s0' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan0' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan0'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan1' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan1'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._session._input: PromiscuousUDPInputSession(InputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=7509), remote_node_id=None), PayloadMetadata(extent_bytes=12)): Received UDP packet of 35 bytes from ('127.0.0.1', 33549) containing frame: UDPFrame(priority=NOMINAL, transfer_id=84, index=0, end_of_transfer=True, payload=56000000000000f00b3428, source_node_id=65532, destination_node_id=None, data_specifier=MessageDataSpecifier(subject_id=7509), user_data=0)
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._session._input: PromiscuousUDPInputSession(InputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=7509), remote_node_id=None), PayloadMetadata(extent_bytes=12)): Received UDP packet of 35 bytes from ('127.0.0.1', 33549) containing frame: UDPFrame(priority=NOMINAL, transfer_id=84, index=0, end_of_transfer=True, payload=56000000000000f00b3428, source_node_id=65532, destination_node_id=None, data_specifier=MessageDataSpecifier(subject_id=7509), user_data=0)
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan2' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan2'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'lo' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'enp4s0' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'wlp8s0' is not capturable because the iface is not up. That device is not up
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'wlp8s0'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'bluetooth-monitor' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'bluetooth-monitor'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not activate capture on 'nflog': Generic error; NFULNL_CFG_CMD_PF_UNBIND: Operation not permitted
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'nflog'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not activate capture on 'nfqueue': Generic error; NFQNL_CFG_CMD_PF_UNBIND: Operation not permitted
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'nfqueue'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'dbus-system' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'dbus-system'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'dbus-session' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'dbus-session'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Capture sessions with filter 'udp and dst net 239.0.0.0/15' have been set up on: ['enp6s0', 'lo', 'enp4s0']
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'enp6s0' is started: <Thread(pcap_enp6s0, started daemon 139945110812352)>
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'lo' is started: <Thread(pcap_lo, started daemon 139945102419648)>
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'enp4s0' is started: <Thread(pcap_enp4s0, started daemon 139945094026944)>
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._udp: UDPTransport('192.168.1.200', local_node_id=36671, service_transfer_multiplier=1, mtu=1200): Starting UDP/IP packet capture (hope you have permissions)
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._v4: Constructed BPF filter expression: 'udp and dst net 239.0.0.0/15'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Synthetic device 'any' does not support promiscuous mode, skipping
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Capturable network devices: ['enp6s0', 'vcan0', 'vcan1', 'vcan2', 'lo', 'enp4s0', 'wlp8s0', 'bluetooth-monitor', 'nflog', 'nfqueue', 'dbus-system', 'dbus-session']
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'enp6s0' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan0' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan0'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan1' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan1'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'vcan2' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'vcan2'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'lo' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'enp4s0' is configured to use the data link type 1
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'wlp8s0' is not capturable because the iface is not up. That device is not up
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'wlp8s0'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'bluetooth-monitor' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'bluetooth-monitor'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not activate capture on 'nflog': Generic error; NFULNL_CFG_CMD_PF_UNBIND: Operation not permitted
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'nflog'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not activate capture on 'nfqueue': Generic error; NFQNL_CFG_CMD_PF_UNBIND: Operation not permitted
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'nfqueue'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'dbus-system' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'dbus-system'
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: Device 'dbus-session' supports none of the following data link types: [1, 108, 0]. Last error was: NULL is not one of the DLTs supported by this device
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Could not set up capture on 'dbus-session'
2023-09-01 23:54:01 1930786 INF pycyphal.transport.udp._ip._link_layer: Capture sessions with filter 'udp and dst net 239.0.0.0/15' have been set up on: ['enp6s0', 'lo', 'enp4s0']
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'enp6s0' is started: <Thread(pcap_enp6s0, started daemon 139944865560256)>
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'lo' is started: <Thread(pcap_lo, started daemon 139944857167552)>
2023-09-01 23:54:01 1930786 DEB pycyphal.transport.udp._ip._link_layer: LinkLayerSniffer(filter_expression='udp and dst net 239.0.0.0/15', num_devices=3, num_devices_active=3): Worker thread for 'enp4s0' is started: <Thread(pcap_enp4s0, started daemon 139944848774848)>
2023-09-01 23:54:01 1930786 INF yakut.cmd.monitor: Starting plug-and-play allocator using file '/tmp/discard'