ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

Private network collides with Ethereum's main net #15358

Closed etirium closed 5 years ago

etirium commented 6 years ago

System information

Geth version: 1.7.2
OS & Version: Linux
Commit hash: n/a

Expected behaviour

The private network does not discover Ethereum's main net nodes.

Actual behaviour

The private network is discovering Ethereum's main net nodes, which generates a lot of useless network traffic, since our networkid is different and our blockchain has a different genesis block. This cannot be stopped: once the network is contaminated with nodes from Ethereum's main net (and vice versa), the discovery process accelerates and network traffic increases. We changed params/bootnodes.go and removed all entries from var MainnetBootnodes[], but the problem persists.

Verbosity=6 shows these entries:

TRACE[10-23|00:57:39] Accepted connection                      addr=72.36.89.14:3523
TRACE[10-23|00:57:39] Failed RLPx handshake                    addr=72.36.89.14:3523      conn=inbound err="ecies: invalid message"

The geth log is filled with hundreds of such messages.

The ./bin/bootnode log shows messages like these:


TRACE[10-23|00:57:39] >> PING/v4                               addr=76.17.153.49:30399    err=nil
TRACE[10-23|00:57:39] << PONG/v4                               addr=207.154.207.136:30303 err=nil
TRACE[10-23|00:57:39] << PONG/v4                               addr=76.17.153.49:30399    err=nil
TRACE[10-23|00:57:39] >> PING/v4                               addr=217.121.155.166:30303 err=nil
TRACE[10-23|00:57:40] << PONG/v4                               addr=217.121.155.166:30303 err=nil
TRACE[10-23|00:57:40] >> NEIGHBORS/v4                          addr=173.56.43.56:30399    err=nil
TRACE[10-23|00:57:40] >> NEIGHBORS/v4                          addr=173.56.43.56:30399    err=nil
TRACE[10-23|00:57:40] << FINDNODE/v4                           addr=173.56.43.56:30399    err=nil

The above IP addresses do not belong to our network, they are from Ethereum's main net.

Starting with --nodiscover doesn't stop this behaviour, since Ethereum's main net already has our nodes in its list. Even if it worked, we couldn't use the --nodiscover option, because we would then not discover the nodes belonging to our own network. And if one of our users forgot to add --nodiscover when starting geth, the problem would appear again.

Steps to reproduce the behaviour

During some early tests we used geth without the --bootnodes option and it connected to Ethereum's main net with our own blockchain. This has probably contaminated our cache. Another way this could have happened is that during our tests we mistakenly ran the geth binary from Ethereum's distribution, which has hardcoded bootnodes in it. Now both networks are colliding and trying to discover nodes that do not belong to them.

The question is: how do you keep a private network from exchanging TCP/UDP traffic with another network? And how do you remove foreign nodes from the cache once it has been contaminated?

karalabe commented 6 years ago

@fjl is working on speccing out the next version of the discovery protocol which should allow adding some tags to nodes so they can better track (or not) each other. The current discovery protocol was not designed for multiple networks, so there aren't any meaningful fail-safes to keep them separated.

immesys commented 6 years ago

In my private network I modified geth to include the network ID as a kind of salt when calculating the hash in encodePacket and decodePacket in p2p/discover/udp.go and p2p/discv5/udp.go.

It works extremely well, as at that level, peers on the wrong chain do not get asked for their peers, nor do they get stored.

nuliknol commented 6 years ago

@immesys, wouldn't it be easier to just change the line pingPacket = iota + 1 to something like pingPacket = iota + 74, where 74 is a random number? Only one line of code changes.
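To make the suggestion concrete, here is a minimal sketch of what shifting the packet type codes would do. The names typeOffset, packetName and errUnknownType are made up for illustration; only the idea of starting the iota at an offset comes from the comment above.

```go
package main

import (
	"errors"
	"fmt"
)

// typeOffset is a hypothetical, privately chosen shift of the packet type
// codes. The unmodified constant block starts at 1 (iota + 1).
const typeOffset = 74

const (
	pingPacket = iota + typeOffset
	pongPacket
	findnodePacket
	neighborsPacket
)

var errUnknownType = errors.New("unknown packet type")

// packetName maps a type byte to a packet name and rejects every value
// outside the shifted range, including the type codes of unmodified nodes.
func packetName(ptype byte) (string, error) {
	switch ptype {
	case pingPacket:
		return "ping", nil
	case pongPacket:
		return "pong", nil
	case findnodePacket:
		return "findnode", nil
	case neighborsPacket:
		return "neighbors", nil
	default:
		return "", fmt.Errorf("%w: %d", errUnknownType, ptype)
	}
}

func main() {
	// 1 is the ping type of an unmodified node; 74 and 77 are shifted types.
	for _, t := range []byte{1, 74, 77} {
		name, err := packetName(t)
		fmt.Println(t, name, err)
	}
}
```

One caveat, assuming the type byte is only inspected after the hash check and the signature recovery: foreign packets would then be rejected later, and at a higher CPU cost, than with a salted hash, which fails at the very first check.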

thomasyuan commented 6 years ago

I was wondering, isn't it caused by the hardcoded bootnodes? For example, when I run geth --datadir ./private --networkid 12345 dumpconfig, I get something like this:

...
[Node.P2P]
MaxPeers = 25
NoDiscovery = false
DiscoveryV5Addr = ":30304"
BootstrapNodes = ["enode://a979fb575495b8d6db44f750317d0f4622bf4c2aa3365d6af7c284339968eef29b69ad0dce72a4d8db5ebb4968de0e3bec910127f134779fbcb0cb6d3331163c@52.16.188.185:30303", "enode://3f1d12044546b76342d59d4a05532c14b85aa669704bfe1f864fe079415aa2c02d743e03218e57a33fb94523adb54032871a6c51b2cc5514cb7c7e35b3ed0a99@13.93.211.84:30303", "enode://78de8a0916848093c73790ead81d1928bec737d565119932b98c6b100d944b7a95e94f847f689fc723399d2e31129d182f7ef3863f2b4c820abbf3ab2722344d@191.235.84.50:30303", "enode://158f8aab45f6d19c6cbf4a089c2670541a8da11978a2f90dbf6a502a4a3bab80d288afdbeb7ec0ef6d92de563767f3b1ea9e8e334ca711e9f8e2df5a0385e8e6@13.75.154.138:30303", "enode://1118980bf48b0a3640bdba04e0fe78b1add18e1cd99bf22d53daac1fd9972ad650df52176e7c7d89d1114cfef2bc23a2959aa54998a46afcf7d91809f0855082@52.74.57.123:30303", "enode://979b7fa28feeb35a4741660a16076f1943202cb72b6af70d327f053e248bab9ba81760f39d0701ef1d8f89cc1fbd2cacba0710a12cd5314d5e0c9021aa3637f9@5.1.83.226:30303"]

From my point of view, if the user sets networkid, the default bootnodes should be removed. Maybe they are removed in the logic, but not from the dumped configuration?
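A minimal sketch of the behaviour proposed here, not of what geth actually does: the function chooseBootnodes, the constant mainnetNetworkID and the placeholder enode URL are all hypothetical.

```go
package main

import "fmt"

// mainnetNetworkID is the network ID of Ethereum's main net.
const mainnetNetworkID = 1

// chooseBootnodes sketches the proposal: explicitly configured bootnodes
// always win, and the hardcoded defaults are used only on the main net.
func chooseBootnodes(networkID uint64, explicit, defaults []string) []string {
	if len(explicit) > 0 {
		return explicit
	}
	if networkID == mainnetNetworkID {
		return defaults
	}
	return nil // custom network without --bootnodes: no defaults
}

func main() {
	// Placeholder entry, not a real enode URL.
	defaults := []string{"enode://placeholder@192.0.2.1:30303"}

	fmt.Println(len(chooseBootnodes(1, nil, defaults)))     // 1
	fmt.Println(len(chooseBootnodes(12345, nil, defaults))) // 0
}
```

This would only stop a private node from contacting the default bootnodes itself. It would not stop main net nodes that already know the private node from contacting it.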

d10r commented 6 years ago

Does this mean that this claim (source) is not correct?

Since connections between nodes are valid only if peers have identical protocol version and network ID, you can effectively isolate your network by setting either of these to a non default value.

I had a lot of trouble with a custom testnet being spammed with connection attempts from alien nodes, and for a while I assumed something was wrong with the config.

How do nodes of public testnets (e.g. Rinkeby) avoid this issue? Or don't they?

karalabe commented 6 years ago

Unfortunately the old v4 of the discovery protocol isn't capable of managing different networks cleanly, so only the upper-level protocols reject the connection (after doing the handshake).
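The late rejection described here can be sketched as follows. The status struct, checkStatus and the error values are simplified, hypothetical stand-ins, not the actual eth protocol code; the point is only that the network ID and genesis hash are compared after a connection has already been established.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a simplified, hypothetical stand-in for the information peers
// exchange in the upper-level protocol handshake.
type status struct {
	ProtocolVersion uint32
	NetworkID       uint64
	Genesis         [32]byte
}

var (
	errProtocolMismatch = errors.New("protocol version mismatch")
	errNetworkMismatch  = errors.New("network ID mismatch")
	errGenesisMismatch  = errors.New("genesis block mismatch")
)

// checkStatus rejects a peer whose handshake data differs from ours. By the
// time it runs, discovery traffic and a TCP connection have already happened.
func checkStatus(local, remote status) error {
	switch {
	case local.ProtocolVersion != remote.ProtocolVersion:
		return errProtocolMismatch
	case local.NetworkID != remote.NetworkID:
		return errNetworkMismatch
	case local.Genesis != remote.Genesis:
		return errGenesisMismatch
	}
	return nil
}

func main() {
	// Placeholder genesis hashes and protocol version for illustration.
	var privGenesis, otherGenesis [32]byte
	privGenesis[0], otherGenesis[0] = 0xaa, 0xbb

	local := status{ProtocolVersion: 63, NetworkID: 12345, Genesis: privGenesis}
	alien := status{ProtocolVersion: 63, NetworkID: 1, Genesis: otherGenesis}

	fmt.Println(checkStatus(local, local)) // <nil>
	fmt.Println(checkStatus(local, alien)) // network ID mismatch
}
```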

Currently only light clients avoid this by using an experimental discovery protocol. Would be nice to standardize it, but even though it's annoying, the discovery protocol is a fairly low priority thing, since it kind of works, and there's always something more important to focus on.

sivachaitanya commented 6 years ago

@karalabe This is exactly what I'm seeing in my private network too, and it is driving all CPU cores to 100% usage.

sivachaitanya commented 6 years ago

@immesys can you shed more light on how you added the network ID to the hash calculation in the encodePacket and decodePacket functions in the udp.go files?

immesys commented 6 years ago

@sivachaitanya I've used it for years now and there have been no adverse effects. I only peer with my own network's nodes.

Here are the full details of the hack. Modify p2p/discover/udp.go with a diff like this:

--- a/p2p/discover/udp.go
+++ b/p2p/discover/udp.go
@@ -34,6 +34,12 @@ import (

 const Version = 4

+// We don't want to inconvenience the ethereum people
+// by advertising tons of our peers to their network, when we
+// are on a different chain. The easiest way is to fail the
+// MAC check right at the start of the packet decode.
+var Salt = []byte{0x42, 0x4f, 0x53, 0x55, 0x57, 0x42, 0x56, 0x45}
+
 // Errors
 var (
        errPacketTooSmall   = errors.New("too small")
@@ -487,7 +493,7 @@ func encodePacket(priv *ecdsa.PrivateKey, ptype byte, req interface{}) ([]byte,
        // add the hash to the front. Note: this doesn't protect the
        // packet in any way. Our public key will be part of this hash in
        // The future.
-       copy(packet, crypto.Keccak256(packet[macSize:]))
+       copy(packet, crypto.Keccak256(packet[macSize:], Salt))
        return packet, nil
 }

@@ -529,7 +535,7 @@ func decodePacket(buf []byte) (packet, NodeID, []byte, error) {
                return nil, NodeID{}, nil, errPacketTooSmall
        }
        hash, sig, sigdata := buf[:macSize], buf[macSize:headSize], buf[headSize:]
-       shouldhash := crypto.Keccak256(buf[macSize:])
+       shouldhash := crypto.Keccak256(buf[macSize:], Salt)
        if !bytes.Equal(hash, shouldhash) {
                return nil, NodeID{}, nil, errBadHash
        }

I chose a random hardcoded salt. You could use the encoded network ID or something similar instead.
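As a rough sketch of the "encoded network id" variant: the helpers saltFromNetworkID, seal and open below are hypothetical, and SHA-256 stands in for crypto.Keccak256 so that the example runs on the standard library alone. The salt is appended after the payload, matching the argument order in the diff above.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

const macSize = 32 // length of the hash prepended to every packet

// saltFromNetworkID encodes the network ID as 8 big-endian bytes.
// (Hypothetical helper; the diff above uses a hardcoded salt.)
func saltFromNetworkID(id uint64) []byte {
	salt := make([]byte, 8)
	binary.BigEndian.PutUint64(salt, id)
	return salt
}

// mac hashes the payload followed by the salt. SHA-256 is a stand-in for
// Keccak-256 here; it is an assumption made to avoid external packages.
func mac(payload, salt []byte) []byte {
	h := sha256.New()
	h.Write(payload)
	h.Write(salt)
	return h.Sum(nil)
}

// seal prepends the salted hash to the payload, as encodePacket would.
func seal(payload, salt []byte) []byte {
	return append(mac(payload, salt), payload...)
}

// open verifies the salted hash, as decodePacket would, and returns the
// payload only if the sender used the same salt.
func open(packet, salt []byte) ([]byte, bool) {
	if len(packet) < macSize {
		return nil, false
	}
	if !bytes.Equal(packet[:macSize], mac(packet[macSize:], salt)) {
		return nil, false
	}
	return packet[macSize:], true
}

func main() {
	ours := saltFromNetworkID(12345)

	_, ok := open(seal([]byte("ping"), ours), ours)
	fmt.Println("own network accepted:", ok)

	// An unmodified node hashes without any salt.
	_, ok = open(seal([]byte("ping"), nil), ours)
	fmt.Println("unsalted packet accepted:", ok)
}
```

With this scheme every node on the private network derives the same salt from its configured network ID, so no separate secret has to be distributed. The salt is not a secret, though, so this separates networks rather than authenticating peers.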

sivachaitanya commented 6 years ago

@immesys This is a fantastic hack for getting around the CPU usage and for keeping our private network from connecting to alien nodes. Kudos for sharing the details. After we injected the salt we see the logs below, which is as expected:

DEBUG[09-06|04:06:48.021] Bad discv4 packet addr=66.147.230.39:20203 err="bad hash"
DEBUG[09-06|04:06:48.030] Bad discv4 packet addr=142.93.73.78:30303 err="bad hash"
DEBUG[09-06|04:06:48.247] Found seed node in database id=a979fb575495b8d6 addr=52.16.188.185:30303 age=426724h6m48.247995935s
DEBUG[09-06|04:06:48.248] Found seed node in database id=3f1d12044546b763 addr=13.93.211.84:30303 age=426724h6m48.248054535s
DEBUG[09-06|04:06:48.248] Found seed node in database id=78de8a0916848093 addr=191.235.84.50:30303 age=426724h6m48.248101935s
DEBUG[09-06|04:06:48.248] Found seed node in database id=158f8aab45f6d19c addr=13.75.154.138:30303 age=426724h6m48.248131236s
DEBUG[09-06|04:06:48.248] Found seed node in database id=1118980bf48b0a36 addr=52.74.57.123:30303 age=426724h6m48.248171936s
DEBUG[09-06|04:06:48.248] Found seed node in database id=979b7fa28feeb35a addr=5.1.83.226:30303 age=426724h6m48.248197736s
DEBUG[09-06|04:06:48.324] Bad discv4 packet addr=138.197.8.171:30303 err="bad hash"
DEBUG[09-06|04:06:48.806] Bad discv4 packet addr=62.182.175.36:30303 err="bad hash"
DEBUG[09-06|04:06:49.028] Bad discv4 packet addr=13.230.221.59:30303 err="bad hash"
DEBUG[09-06|04:06:49.181] Bad discv4 packet addr=66.147.230.39:20203 err="bad hash"
DEBUG[09-06|04:06:49.276] Bad discv4 packet addr=212.32.246.37:1056 err="bad hash"
DEBUG[09-06|04:06:49.476] Bad discv4 packet addr=211.45.60.5:47417 err="bad hash"
DEBUG[09-06|04:06:49.494] Bad discv4 packet addr=203.161.185.210:7253 err="bad hash"
DEBUG[09-06|04:06:49.576] Bad discv4 packet addr=94.130.207.84:30303 err="bad hash"
DEBUG[09-06|04:06:49.681] Bad discv4 packet addr=13.230.221.59:30303 err="bad hash"
DEBUG[09-06|04:06:49.687] Bad discv4 packet addr=138.197.8.171:30303 err="bad hash"
DEBUG[09-06|04:06:50.155] Bad discv4 packet addr=66.147.230.39:20203 err="bad hash"
DEBUG[09-06|04:06:50.291] Bad discv4 packet addr=108.170.1.134:60415 err="bad hash"
DEBUG[09-06|04:06:50.449] Bad discv4 packet addr=13.230.221.59:30303 err="bad hash"
DEBUG[09-06|04:06:50.456] Bad discv4 packet addr=172.104.93.143:50000 err="bad hash"
DEBUG[09-06|04:06:50.703] Bad discv4 packet addr=138.197.186.92:21006 err="bad hash"
DEBUG[09-06|04:06:50.763] Bad discv4 packet addr=52.77.196.144:30303 err="bad hash"
DEBUG[09-06|04:06:51.117] Bad discv4 packet addr=212.32.246.37:1056 err="bad hash"
DEBUG[09-06|04:06:51.622] Bad discv4 packet addr=212.32.246.37:1056 err="bad hash"
DEBUG[09-06|04:06:51.833] Bad discv4 packet addr=192.138.210.100:30303 err="bad hash"
DEBUG[09-06|04:06:52.247] Found seed node in database id=a979fb575495b8d6 addr=52.16.188.185:30303 age=426724h6m52.247756573s
DEBUG[09-06|04:06:52.247] Found seed node in database id=3f1d12044546b763 addr=13.93.211.84:30303 age=426724h6m52.247848774s
DEBUG[09-06|04:06:52.247] Found seed node in database id=78de8a0916848093 addr=191.235.84.50:30303 age=426724h6m52.247871474s
DEBUG[09-06|04:06:52.247] Found seed node in database id=158f8aab45f6d19c addr=13.75.154.138:30303 age=426724h6m52.247888174s
DEBUG[09-06|04:06:52.247] Found seed node in database id=1118980bf48b0a36 addr=52.74.57.123:30303 age=426724h6m52.247903074s
DEBUG[09-06|04:06:52.247] Found seed node in database id=979b7fa28feeb35a addr=5.1.83.226:30303 age=426724h6m52.247922674s
DEBUG[09-06|04:06:52.688] Bad discv4 packet addr=104.156.227.21:60888 err="bad hash"
DEBUG[09-06|04:06:53.183] Bad discv4 packet addr=103.59.166.76:30303 err="bad hash"
DEBUG[09-06|04:06:53.486] Bad discv4 packet addr=2.45.160.70:42724 err="bad hash"
DEBUG[09-06|04:06:53.988] Bad discv4 packet addr=51.140.84.38:30303 err="bad hash"
DEBUG[09-06|04:06:54.132] Bad discv4 packet addr=192.138.210.100:30303 err="bad hash"
DEBUG[09-06|04:06:54.402] Bad discv4 packet addr=35.164.10.186:45408 err="bad hash"
DEBUG[09-06|04:06:54.966] Bad discv4 packet addr=47.91.214.63:30303 err="bad hash"
DEBUG[09-06|04:06:54.985] Bad discv4 packet addr=54.95.80.53:1064 err="bad hash"
DEBUG[09-06|04:06:55.052] Bad discv4 packet addr=45.43.30.2:38012 err="bad hash"
DEBUG[09-06|04:06:55.062] Bad discv4 packet addr=51.140.84.38:30303 err="bad hash"
DEBUG[09-06|04:06:55.171] Bad discv4 packet addr=2.45.160.70:42724 err="bad hash"
DEBUG[09-06|04:06:55.317] Bad discv4 packet addr=52.39.176.45:17717 err="bad hash"
DEBUG[09-06|04:06:55.586] Bad discv4 packet addr=47.74.233.145:30303 err="bad hash"
DEBUG[09-06|04:06:55.741] Bad discv4 packet addr=115.111.61.164:30322 err="bad hash"
DEBUG[09-06|04:06:55.821] Bad discv4 packet addr=52.39.176.45:17717 err="bad hash"
DEBUG[09-06|04:06:56.248] Found seed node in database id=a979fb575495b8d6 addr=52.16.188.185:30303 age=426724h6m56.24824634s
DEBUG[09-06|04:06:56.248] Found seed node in database id=3f1d12044546b763 addr=13.93.211.84:30303 age=426724h6m56.24831504s
DEBUG[09-06|04:06:56.248] Found seed node in database id=78de8a0916848093 addr=191.235.84.50:30303 age=426724h6m56.24833304s
DEBUG[09-06|04:06:56.248] Found seed node in database id=158f8aab45f6d19c addr=13.75.154.138:30303 age=426724h6m56.24834744s
DEBUG[09-06|04:06:56.248] Found seed node in database id=1118980bf48b0a36 addr=52.74.57.123:30303 age=426724h6m56.24835964s
DEBUG[09-06|04:06:56.248] Found seed node in database id=979b7fa28feeb35a addr=5.1.83.226:30303 age=426724h6m56.248371841s
DEBUG[09-06|04:06:56.467] Bad discv4 packet addr=115.111.61.164:30322 err="bad hash"
DEBUG[09-06|04:06:56.586] Bad discv4 packet addr=47.74.233.145:30303 err="bad hash"
DEBUG[09-06|04:06:59.516] Bad discv4 packet addr=18.196.105.45:30304 err="bad hash"
DEBUG[09-06|04:06:59.868] Bad discv4 packet addr=212.164.219.68:42786 err="bad hash"
DEBUG[09-06|04:06:59.942] Bad discv4 packet addr=47.94.37.139:30303 err="bad hash"
DEBUG[09-06|04:07:00.016] Bad discv4 packet addr=18.196.105.45:30304 err="bad hash"
DEBUG[09-06|04:07:00.248] Found seed node in database id=a979fb575495b8d6 addr=52.16.188.185:30303 age=426724h7m0.248403028s
DEBUG[09-06|04:07:00.248] Found seed node in database id=3f1d12044546b763 addr=13.93.211.84:30303 age=426724h7m0.248484728s
DEBUG[09-06|04:07:00.248] Found seed node in database id=78de8a0916848093 addr=191.235.84.50:30303 age=426724h7m0.248510729s
DEBUG[09-06|04:07:00.248] Found seed node in database id=158f8aab45f6d19c addr=13.75.154.138:30303 age=426724h7m0.248536929s
DEBUG[09-06|04:07:00.248] Found seed node in database id=1118980bf48b0a36 addr=52.74.57.123:30303 age=426724h7m0.248560929s
DEBUG[09-06|04:07:00.248] Found seed node in database id=979b7fa28feeb35a addr=5.1.83.226:30303 age=426724h7m0.248585829s
DEBUG[09-06|04:07:00.576] Recalculated downloader QoS values rtt=20s confidence=1.000 ttl=1m0s
DEBUG[09-06|04:07:00.751] Bad discv4 packet addr=18.217.220.106:21002 err="bad hash"
DEBUG[09-06|04:07:00.986] Bad discv4 packet addr=34.217.105.166:30303 err="bad hash"
DEBUG[09-06|04:07:01.108] Bad discv4 packet addr=78.129.229.97:30304 err="bad hash"

FrankSzendzielarz commented 5 years ago

In general the approach should be to specify a custom discovery port and isolate the peers at a network level. By attempting to differentiate your DHT from others while still being available to discovery queries you are setting yourself up as a DoS target.

thomasyuan commented 5 years ago

I guess that’s why I don’t like Ethereum any more. Some basic stuff doesn’t work well, but they don’t care, even when someone has already given reasonable suggestions and proven solutions. I had a pretty simple PR before which only fixed a log message, 100% safe and with no effect on the logic, but no one would take a look. After waiting for a month I cancelled it. The funny thing is that an almost identical PR from the Ethereum developers then showed up and got merged in maybe 2~3 days. Keep working on that “important” stuff; you should never mind the voice of the community.

karalabe commented 5 years ago

Some basic stuff doesn’t work well, but they don’t care, even when someone has already given reasonable suggestions and proven solutions.

It's not reasonable, it's a security issue. When such a network is eclipsed, partitioned and attacked, will you take responsibility for it? Of course not.

I had a pretty simple PR before which only fixed a log message, 100% safe and with no effect on the logic, but no one would take a look. After waiting for a month I cancelled it.

@thomasyuan Your PR was open for a whopping 4 days before you cancelled it and it was a weekend. Luckily GitHub has a history and we can verify facts vs. false statements.

thomasyuan commented 5 years ago

It’s been 22 months since this issue was filed. If it is a security issue, it should have been closed within maybe one month (is that reasonable?).

Sorry, I didn’t check the history (it was a weekend, from Friday to Tuesday); my bad. But if PRs were treated like my complaint comment (a response within minutes, not necessarily a merge within minutes), I wouldn’t even have had time to cancel it.

No more comments.