Open linas opened 5 years ago
Might be related: https://github.com/moby/moby/issues/32001
This issue looks like a lot of fun. Geth uses regular networking APIs, we do not use raw packets. I'm assuming that this is an issue with UDP packet routing across containers.
I cannot reproduce the moby bug, either because lxc does networking in a more standard way as compared to docker, or because my kernel version 4.19 doesn't have the routing bug (i.e. was fixed sometime earlier).
@fjl Am I to understand that geth uses UDP (not TCP) for chain download? Also, please note: even if "regular networking APIs" are used, this does not avoid garbage packet sources. That is, the source (and not just destination) addresses are pulled from userland data structures, and if those structures have junk in them, that junk will get used as the source address. (In the case of C, to send a UDP packet, one does bind(fd, struct sockaddr* src_addr); sendto(fd, struct sockaddr* dest_addr); so if the contents of struct sockaddr src_addr contained garbage, the UDP would still more-or-less work (packets will get delivered), but the intervening firewall will notice the invalid source and call them martians.)
Geth uses UDP to discover other nodes and TCP for downloads. Since this is a Go-based project, the way we use UDP is not quite comparable to C code.
We have moved this to the backlog because this is more of a nuisance and we don't consider it a serious issue for now. I will definitely investigate it more when I have some time.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Looking at this again, I still don't understand how UDP packets with such source addresses could be generated by geth.
If geth is running in a container that can connect to the discovery network, it will find other nodes and contact them via UDP. But the packets sent will come from the address assigned to geth.
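To illustrate the point about source addresses, here is a minimal sketch using only Go's standard net package. It is not geth's discovery code; the function name sourceSeenByReceiver is hypothetical, and the loopback addresses are chosen only so the example runs anywhere. It binds a UDP socket, sends one datagram, and reports the source address the receiver observes, which should match the address the sender was bound to.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sourceSeenByReceiver (hypothetical helper) sends one datagram from a UDP
// socket bound to bindAddr and returns both the sender's local address and
// the source address that the receiving socket reports.
func sourceSeenByReceiver(bindAddr string) (*net.UDPAddr, *net.UDPAddr, error) {
	// Receiver on an ephemeral loopback port.
	rx, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return nil, nil, err
	}
	defer rx.Close()

	// Sender: the bound local address is what the kernel uses as the source.
	local, err := net.ResolveUDPAddr("udp4", bindAddr)
	if err != nil {
		return nil, nil, err
	}
	tx, err := net.ListenUDP("udp4", local)
	if err != nil {
		return nil, nil, err
	}
	defer tx.Close()

	dst := rx.LocalAddr().(*net.UDPAddr)
	if _, err := tx.WriteToUDP([]byte("ping"), dst); err != nil {
		return nil, nil, err
	}

	// Do not block forever if the datagram is lost.
	if err := rx.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, nil, err
	}
	buf := make([]byte, 16)
	_, from, err := rx.ReadFromUDP(buf)
	if err != nil {
		return nil, nil, err
	}
	return tx.LocalAddr().(*net.UDPAddr), from, nil
}

func main() {
	sent, seen, err := sourceSeenByReceiver("127.0.0.1:0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sender bound to:", sent)
	fmt.Println("receiver saw:   ", seen)
}
```

With this API the application never fills in a source address per packet; it comes from the socket's bound address, so a bogus source would have to be introduced somewhere else (for example by NAT or routing between the container and the firewall).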
System information
Geth version: geth version: 1.9.4-stable-46891c12
OS & Version: Windows/Linux/OSX: Ubuntu xenial running in LXC container
Expected behaviour
No martian network (tcp/ip) packets...
Actual behaviour
Observing Martian packets at network firewall.
Steps to reproduce the behaviour
The instructions below describe a container. I presume that a container is not actually necessary to reproduce the issue, but it is useful for isolating the network behavior. So:
1. Install geth from http://ppa.launchpad.net/ethereum/ethereum/ubuntu xenial/main amd64 in that container.
2. Remove ~/.ethereum so that a full download of the chain is forced.
3. Run geth console
4. On the firewall, run /bin/echo "1" > /proc/sys/net/ipv4/conf/all/log_martians
5. Run tail -f /var/log/syslog and observe martian packets.
Syslog entries
In the below, 10.0.3.246 is the internal-network IP address of the LXC container. The firewall has martian packet logging enabled (by saying /bin/echo "1" > /proc/sys/net/ipv4/conf/all/log_martians). Immediately upon the start of geth console, as the chain starts getting downloaded, the following appears in /var/log/syslog:
This continues the entire time, hours later, except now the IP addrs are more varied:
I've observed 896 of these so far...
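For anyone wanting to tally these entries rather than count by hand, here is a rough sketch in Go. It assumes the kernel log lines have the usual form "martian source <destination> from <source>, on dev <iface>". The function name countMartians is hypothetical, and the sample lines in main are made up for illustration (using documentation-range source addresses); they are not copied from this report.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Assumed kernel log format: "martian source <dst> from <src>, on dev <iface>".
var martianRe = regexp.MustCompile(`martian source (\S+) from (\S+), on dev (\S+)`)

// countMartians (hypothetical helper) scans log text and returns the total
// number of martian entries plus a tally keyed by the reported source address.
func countMartians(log string) (int, map[string]int) {
	total := 0
	bySrc := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		m := martianRe.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // not a martian line (e.g. the "ll header" follow-up line)
		}
		total++
		bySrc[m[2]]++ // m[2] is the address after "from", i.e. the bogus source
	}
	return total, bySrc
}

func main() {
	// Illustrative sample only; real input would be the contents of /var/log/syslog.
	sample := strings.Join([]string{
		"kernel: IPv4: martian source 10.0.3.246 from 192.0.2.7, on dev eth0",
		"kernel: ll header: 00000000: ff ff ff ff ff ff",
		"kernel: IPv4: martian source 10.0.3.246 from 198.51.100.9, on dev eth0",
	}, "\n")
	total, bySrc := countMartians(sample)
	fmt.Println("martian entries:", total)
	for src, n := range bySrc {
		fmt.Printf("  %s: %d\n", src, n)
	}
}
```

Grouping by source address makes it easier to see whether the bogus sources cluster around a few values or are spread widely.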
That these come from geth can be easily established, as it is the only thing running in the container:
That is, nothing else is using the IP addr 10.0.3.246 except for systemd and sshd, and those don't generate martians; they only happen while geth is running. I conclude that geth is somehow hand-building tcp/ip packets, and somehow building them wrong (i.e. setting the packet source address to a bogus value). My guess is a use-before-initialization: that some structure is not zeroed before use, and the previous garbage IP addr in there gets used as an IP addr.