smoltcp-rs / smoltcp

a smol tcp/ip stack
BSD Zero Clause License

`smoltcp` does not respond to first ICMP echo request #597

Open kelleyk opened 2 years ago

kelleyk commented 2 years ago

When I boot a dev board running smoltcp@v0.8.0 and ping it, I see the following defmt output on the host:

DEBUG address 192.168.1.1 not in neighbor cache, sending ARP request
└─ smoltcp::iface::interface::{impl#3}::lookup_hardware_addr @ /home/kelleyk/.cargo/registry/src/github.com-1ecc6299db9ec823/smoltcp-0.8.0/src/macros.rs:18
DEBUG Failed to send response: Unaddressable
└─ smoltcp::iface::interface::{impl#2}::socket_ingress::{closure#0} @ /home/kelleyk/.cargo/registry/src/github.com-1ecc6299db9ec823/smoltcp-0.8.0/src/macros.rs:18
TRACE filled 192.168.1.1 => Ethernet(Address([172, 31, 107, 171, 181, 209])) (was empty)
└─ smoltcp::iface::neighbor::{impl#1}::fill @ /home/kelleyk/.cargo/registry/src/github.com-1ecc6299db9ec823/smoltcp-0.8.0/src/macros.rs:17

... and on the host side, I see no reply to that first ping:

$ ping 192.168.1.99
PING 192.168.1.99 (192.168.1.99) 56(84) bytes of data.
64 bytes from 192.168.1.99: icmp_seq=2 ttl=64 time=0.199 ms
64 bytes from 192.168.1.99: icmp_seq=3 ttl=64 time=0.189 ms

Is it possible that smoltcp is populating the ARP cache after processing the inbound ICMP echo request? If so, is this deliberate/known behavior, or would you be open to a patch that reordered things?

adamgreig commented 2 years ago

I ran into this issue a while ago and we discussed it on Matrix. In short, yes: the ARP request is transmitted in place of the ICMP echo response, and nothing remembers that we received the echo request, so the response is not sent once the ARP response arrives.

Previously the ARP cache was populated more opportunistically, but this led to various horrible bugs, so 6210612be047ee706ac729015cdbc2581e6ae9a3 made ARP filling more restrictive. In the chat we discussed possibly allowing the ARP cache to be populated from received unicast packets, which I think would fix this, but I don't think that has been done.
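The behaviour described in this comment can be illustrated with a toy model. This is only a sketch of the sequence of events, not smoltcp's code: `ToyIface`, `Tx`, and the method names are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// What the toy interface transmits in response to an echo request.
#[derive(Debug, PartialEq)]
enum Tx {
    /// The echo reply, addressed to a known hardware address.
    EchoReply { dst_mac: [u8; 6] },
    /// An ARP request sent instead of the reply; the reply itself is dropped.
    ArpRequest { target: Ipv4Addr },
}

/// Hypothetical stand-in for an interface with a neighbor cache.
struct ToyIface {
    neighbors: HashMap<Ipv4Addr, [u8; 6]>,
}

impl ToyIface {
    fn new() -> Self {
        Self { neighbors: HashMap::new() }
    }

    /// Handle an inbound echo request. The sender's MAC is deliberately
    /// NOT learned from the frame, mirroring the restrictive fill policy.
    fn on_echo_request(&mut self, src_ip: Ipv4Addr, _src_mac: [u8; 6]) -> Tx {
        match self.neighbors.get(&src_ip) {
            Some(&mac) => Tx::EchoReply { dst_mac: mac },
            // Cache miss: nothing queues the reply, so it is lost.
            None => Tx::ArpRequest { target: src_ip },
        }
    }

    /// Only an ARP reply fills the cache in this model.
    fn on_arp_reply(&mut self, ip: Ipv4Addr, mac: [u8; 6]) {
        self.neighbors.insert(ip, mac);
    }
}
```

In this model the first echo request yields only an ARP request, and the second one, arriving after the ARP reply, is answered, which matches the `icmp_seq=2` line being the first reply in the ping output above.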

Dirbaio commented 2 years ago

I'm starting to think the original fix was too conservative, yup. Adding cache filling back in a super restricted form, like "only fill if src+dst MAC and IP are all unicast, and the DST ip+mac is ours" would hopefully work, and should cover all "first packet gets lost" cases.
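As a rough sketch of what such a restricted fill rule could look like (IPv4 only), assuming a hypothetical `FrameAddrs` struct and helper names that are not part of smoltcp:

```rust
use std::net::Ipv4Addr;

/// Addresses taken from one received frame (hypothetical type for this sketch).
struct FrameAddrs {
    src_mac: [u8; 6],
    dst_mac: [u8; 6],
    src_ip: Ipv4Addr,
    dst_ip: Ipv4Addr,
}

/// A MAC address is unicast when the group bit (least significant bit
/// of the first octet) is clear. Broadcast ff:ff:ff:ff:ff:ff has it set.
fn mac_is_unicast(mac: [u8; 6]) -> bool {
    mac[0] & 0x01 == 0
}

/// Rough IPv4 unicast check. It does not catch subnet-directed
/// broadcasts (e.g. 192.168.1.255), which would need the prefix length.
fn ip_is_unicast(ip: Ipv4Addr) -> bool {
    !(ip.is_unspecified() || ip.is_multicast() || ip.is_broadcast())
}

/// "Only fill if src+dst MAC and IP are all unicast, and the dst IP+MAC is ours."
fn may_fill_from_frame(f: &FrameAddrs, our_mac: [u8; 6], our_ip: Ipv4Addr) -> bool {
    mac_is_unicast(f.src_mac)
        && mac_is_unicast(f.dst_mac)
        && ip_is_unicast(f.src_ip)
        && ip_is_unicast(f.dst_ip)
        && f.dst_mac == our_mac
        && f.dst_ip == our_ip
}
```

With this predicate, the unicast echo request from the original report would be allowed to fill the cache, while anything sent to a broadcast or multicast address would not.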

Does anyone know what Linux does? Does it fill from packets like this, or does it buffer the ICMP response somewhere while it sends the ARP request?

thvdveld commented 11 months ago

> Adding cache filling back in a super restricted form, like "only fill if src+dst MAC and IP are all unicast, and the DST ip+mac is ours" would hopefully work, and should cover all "first packet gets lost" cases.

This would not work in the case of RPL. When a node in the network receives a packet not destined for itself, it forwards the packet to its parent without changing the source IP address. The forwarding node therefore transmits a frame with its own link-layer address but with a source IP address that is not its own. The parent node thus receives a message whose link-layer source address does not belong to the owner of the source IP address.

Edit: maybe we could actually add them how you say it, but also only when the hop limit or time to live is 64 or 255, reducing the chance that it is from a forwarded packet.
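A minimal sketch of that extra check, with a hypothetical function name. It is only a heuristic: a sender whose stack uses a different initial value would never be learned this way, and a forwarded packet can still arrive with one of these values.

```rust
/// Returns true when the received hop limit / TTL equals one of the two
/// initial values named above, which suggests (but does not prove) that
/// the packet was not decremented by a forwarder on the way.
fn looks_unforwarded(hop_limit: u8) -> bool {
    matches!(hop_limit, 64 | 255)
}
```

This check would be combined with the unicast and "destined for us" conditions from the quoted proposal rather than used on its own.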

Edit 2: what about IPv6 packets transmitted to the ALL_NODES and ALL_RPL_NODES multicast addresses? In the case of RPL, I would add their senders to the neighbor cache. E.g. a DIO (some kind of advertisement from a neighbor) should only be transmitted by a neighbor, with the destination IP address being the ALL_RPL_NODES address.
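A sketch of the destination check this would need, using the well-known link-local addresses `ff02::1` (all nodes) and `ff02::1a` (all RPL nodes). The constant and function names are local to this example and are not taken from smoltcp.

```rust
use std::net::Ipv6Addr;

/// Link-local all-nodes multicast address, ff02::1.
const ALL_NODES: Ipv6Addr = Ipv6Addr::new(0xff02, 0, 0, 0, 0, 0, 0, 0x0001);
/// Link-local all-RPL-nodes multicast address, ff02::1a.
const ALL_RPL_NODES: Ipv6Addr = Ipv6Addr::new(0xff02, 0, 0, 0, 0, 0, 0, 0x001a);

/// True if a packet sent to `dst` is link-local multicast that, under the
/// idea above, could be allowed to add its sender to the neighbor cache.
fn dst_allows_fill(dst: Ipv6Addr) -> bool {
    dst == ALL_NODES || dst == ALL_RPL_NODES
}
```

Because both addresses have link-local scope, such packets are not forwarded beyond the link, which is what makes the sender's link-layer address a plausible match for its source IP address.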

ruza-net commented 1 month ago

Just to revive this issue, I also ran into this (only with TCP sockets), and while it doesn't break my application, it makes it significantly more annoying to use. A fix would be greatly welcome :)