ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

martian network packets during `geth console` download of chain #20098

Open linas opened 5 years ago

linas commented 5 years ago

System information

Geth version: 1.9.4-stable-46891c12
OS & Version: Ubuntu xenial running in LXC container

Expected behaviour

No martian network (tcp/ip) packets...

Actual behaviour

Observing Martian packets at network firewall.

Steps to reproduce the behaviour

The instructions below describe a container. I presume that a container is not actually necessary to reproduce the issue, but it is useful for isolating the network behavior. So

Syslog entries

In the below, 10.0.3.246 is the internal-network IP address of the LXC container. The firewall has martian packet logging enabled (by saying /bin/echo "1" > /proc/sys/net/ipv4/conf/all/log_martians). Immediately upon the start of geth console, as the chain starts getting downloaded, the following appears in /var/log/syslog:

Sep 19 09:08:07 linas kernel: [43340.716094] IPv4: martian source 54.203.178.120 from 10.0.3.246, on dev eth0
Sep 19 09:08:07 linas kernel: [43340.716102] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:48 linas kernel: [43380.979351] IPv4: martian source 54.213.199.255 from 10.0.3.246, on dev eth0
Sep 19 09:08:48 linas kernel: [43380.979368] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:57 linas kernel: [43390.683454] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:08:57 linas kernel: [43390.683470] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:57 linas kernel: [43390.683478] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:08:57 linas kernel: [43390.683481] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:57 linas kernel: [43390.683486] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:08:57 linas kernel: [43390.683489] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:57 linas kernel: [43390.683494] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:08:57 linas kernel: [43390.683497] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:09:32 linas kernel: [43424.958311] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:09:32 linas kernel: [43424.958318] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:09:32 linas kernel: [43424.958327] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:09:32 linas kernel: [43424.958331] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:09:32 linas kernel: [43424.958337] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:09:32 linas kernel: [43424.958340] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..

This continues the entire time. Hours later it is still happening, except now the IP addrs are more varied:

Sep 19 15:56:20 linas kernel: [67833.793092] IPv4: martian source 35.237.223.228 from 10.0.3.246, on dev eth0
Sep 19 15:56:20 linas kernel: [67833.793096] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 15:57:04 linas kernel: [67878.109294] IPv4: martian source 52.42.152.99 from 10.0.3.246, on dev eth0
Sep 19 15:57:04 linas kernel: [67878.109301] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 16:02:06 linas kernel: [68179.989823] IPv4: martian source 91.197.44.140 from 10.0.3.246, on dev eth0
Sep 19 16:02:06 linas kernel: [68179.989831] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..

I've observed 896 of these so far...
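A count like this can be reproduced by scanning the syslog for the kernel's martian lines. The following is a minimal sketch in Go, assuming the log format shown above; `countMartians` and `martianRe` are hypothetical names, and the sample input is copied from the entries in this report. It tallies entries by the first address printed on each line and skips the accompanying "ll header" lines.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// martianRe matches kernel lines such as
// "IPv4: martian source 54.203.178.120 from 10.0.3.246, on dev eth0".
// The three groups are the first address, the second address, and the device.
var martianRe = regexp.MustCompile(`martian source (\S+) from (\S+), on dev (\S+)`)

// countMartians (hypothetical helper) returns the number of martian entries
// in the log text and a per-address tally keyed by the first address printed.
func countMartians(log string) (int, map[string]int) {
	byAddr := make(map[string]int)
	total := 0
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		m := martianRe.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // "ll header" lines and unrelated syslog entries
		}
		total++
		byAddr[m[1]]++
	}
	return total, byAddr
}

func main() {
	// Sample lines taken from the syslog excerpt in this report.
	sample := `Sep 19 09:08:07 linas kernel: [43340.716094] IPv4: martian source 54.203.178.120 from 10.0.3.246, on dev eth0
Sep 19 09:08:07 linas kernel: [43340.716102] ll header: 00000000: d0 67 e5 00 69 42 40 16 7e 37 5f 46 08 00        .g..iB@.~7_F..
Sep 19 09:08:57 linas kernel: [43390.683454] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0
Sep 19 09:08:57 linas kernel: [43390.683478] IPv4: martian source 119.28.149.238 from 10.0.3.246, on dev eth0`

	total, byAddr := countMartians(sample)
	fmt.Println("martian entries:", total)
	for addr, n := range byAddr {
		fmt.Println(addr, n)
	}
}
```

To run it against a real log, the sample string could be replaced by the contents of /var/log/syslog read from disk.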

That these come from geth can be easily established, as it is the only thing running in the container:

ubuntu@etherium:~$ ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:02 /lib/systemd/systemd --system --deserialize 20
   15 ?        Ss     0:00 /lib/systemd/systemd-journald
   50 ?        Ss     0:00 /usr/sbin/cron -f
   53 ?        Ss     0:01 /usr/bin/dbus-daemon --system --address=systemd: --no
   57 ?        Ss     0:00 /lib/systemd/systemd-logind
  127 ?        Ss     0:00 /sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /
  152 ?        Ss+    0:00 /sbin/agetty --noclear --keep-baud console 115200 384
  153 pts/1    Ss+    0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400
  154 pts/0    Ss+    0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400
  155 pts/3    Ss+    0:00 /sbin/agetty --noclear --keep-baud pts/3 115200 38400
  156 pts/2    Ss+    0:00 /sbin/agetty --noclear --keep-baud pts/2 115200 38400
  404 ?        Ss     0:00 sshd: ubuntu [priv]
  413 ?        S      0:16 sshd: ubuntu@pts/4
  414 pts/4    Ss     0:00 -bash
 2465 ?        Ss     0:00 /usr/sbin/sshd -D
 6569 ?        Ss     0:00 /lib/systemd/systemd-udevd
 7295 ?        Ssl    0:00 /usr/sbin/rsyslogd -n
12437 ?        Ss     0:00 sshd: ubuntu [priv]
12449 ?        R      0:00 sshd: ubuntu@pts/5
12450 pts/5    Ss     0:00 -bash
12922 pts/4    Sl+  566:20 geth console
13579 pts/5    R+     0:00 ps ax
ubuntu@etherium:~$ 

That is, nothing else is using the IP addr 10.0.3.246 except for systemd and sshd, and those don't generate martians; the martians only appear while geth is running. I conclude that geth is somehow hand-building tcp/ip packets, and somehow building them wrong (i.e. setting the packet source address to a bogus value). My guess is a use-before-initialization: some structure is not zeroed before use, and whatever garbage it previously held gets used as an IP addr.

holiman commented 5 years ago

Might be related: https://github.com/moby/moby/issues/32001

fjl commented 5 years ago

This issue looks like a lot of fun. Geth uses regular networking APIs, we do not use raw packets. I'm assuming that this is an issue with UDP packet routing across containers.

linas commented 5 years ago

I cannot reproduce the moby bug, either because lxc does networking in a more standard way as compared to docker, or because my kernel version 4.19 doesn't have the routing bug (i.e. was fixed sometime earlier).

@fjl Am I to understand that geth uses UDP (not tcp) for chain download? Also, please note: even if "regular networking APIs" are used, this does not avoid garbage packet sources. That is, the source (and not just destination) addresses are pulled from userland data structures, and if those structures have junk in them, that junk will get used as the source address. (In the case of C, to send a UDP packet, one does bind(fd, struct sockaddr* src_addr); sendto(fd, struct sockaddr* dest_addr); so if the contents of struct sockaddr src_addr contained garbage, the UDP would still more-or-less work (packets will get delivered), but the intervening firewall will notice the invalid source and call them martians.)

fjl commented 5 years ago

Geth uses UDP to discover other nodes and TCP for downloads. Since this is a Go-based project, the way we use UDP is not quite comparable to C code.
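For readers unfamiliar with Go's networking API, here is a minimal sketch of sending a UDP datagram with the standard `net` package. It is not geth's discovery code; `sendAndObserve` is a hypothetical helper, and both sockets are bound to loopback so the example runs without network access. The point it illustrates is that `WriteToUDP` takes only a destination, and the receiver sees the sender's bound local address as the source.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sendAndObserve (hypothetical helper) sends one datagram between two
// loopback sockets. It returns the sender's bound address and the source
// address the receiver observed for the datagram.
func sendAndObserve() (*net.UDPAddr, *net.UDPAddr, error) {
	loopback := &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0}

	recv, err := net.ListenUDP("udp4", loopback)
	if err != nil {
		return nil, nil, err
	}
	defer recv.Close()

	send, err := net.ListenUDP("udp4", loopback)
	if err != nil {
		return nil, nil, err
	}
	defer send.Close()

	// Only the destination is passed here; the source comes from the socket.
	dest := recv.LocalAddr().(*net.UDPAddr)
	if _, err := send.WriteToUDP([]byte("ping"), dest); err != nil {
		return nil, nil, err
	}

	// Bound the wait so the example cannot hang if the datagram is lost.
	if err := recv.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return nil, nil, err
	}
	buf := make([]byte, 16)
	_, from, err := recv.ReadFromUDP(buf)
	if err != nil {
		return nil, nil, err
	}
	return send.LocalAddr().(*net.UDPAddr), from, nil
}

func main() {
	bound, seen, err := sendAndObserve()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sender bound to:", bound)
	fmt.Println("receiver saw source:", seen)
}
```

This only shows how the standard library behaves on one host; it says nothing about what a container bridge or NAT layer does to the packet afterwards.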

We have moved this to the backlog because this is more of a nuisance and we don't consider it a serious issue for now. I will definitely investigate it more when I have some time.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

fjl commented 4 years ago

Looking at this again, I still don't understand how UDP packets with such source addresses could be generated by geth.

fjl commented 4 years ago

If geth is running in a container that can connect to the discovery network, it will find other nodes and contact them via UDP. But the packets sent will come from the address assigned to geth.
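One way to check which local address the kernel would assign for traffic toward a given peer is to connect a UDP socket and read back its local address. The sketch below assumes nothing about geth internals; `localSourceFor` is a hypothetical helper, and the loopback destination in `main` is only there so the example runs anywhere. Connecting a UDP socket triggers a route lookup but sends no packets.

```go
package main

import (
	"fmt"
	"net"
)

// localSourceFor (hypothetical helper) returns the local IP the kernel
// selects for UDP traffic toward dest ("host:port"). No datagram is sent;
// the connect step only fixes the peer and picks a local address.
func localSourceFor(dest string) (net.IP, error) {
	conn, err := net.Dial("udp4", dest)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	return conn.LocalAddr().(*net.UDPAddr).IP, nil
}

func main() {
	// 30303 is geth's default P2P port; any port works for this lookup.
	ip, err := localSourceFor("127.0.0.1:30303")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("local source address:", ip)
}
```

Run inside the container with one of the addresses from the syslog entries as the destination, this would show whether the selected source is the container's own address.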