rand-chain / rust-randchain

(WIP) Official implementation of the RandChain protocol
GNU General Public License v3.0
1 stars 0 forks source link

p2p: nodes cannot connect to other nodes queried from known peers (due to the OS error 111) #81

Closed SebastianElvis closed 3 years ago

SebastianElvis commented 3 years ago

Description

Nodes exchange their known peers regularly. However, in the current implementation, given newly exchanged peers, a node cannot connect to them. The error message shows that Connection refused (os error 111).

Reproduction

In the proto-dco branch (commit f7d1e77714a49be87b523f1132fa8b8702dd6de8), do

$ docker-compose -f ./tools/docker-compose-4.yml up

and wait for a while.

Expectation

Screenshots

Please find the log below.

➜  rust-randchain git:(proto-dco) docker-compose -f ./tools/docker-compose-4.yml up
Creating network "tools_randchain_privnet" with driver "macvlan"
WARNING: Found orphan containers (tools_seednode_1, tools_node_2, tools_node_3, tools_node_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Creating tools_node1_1 ... done
Creating tools_node4_1 ... done
Creating tools_node2_1 ... done
Creating tools_node3_1 ... done
Attaching to tools_node1_1, tools_node3_1, tools_node4_1, tools_node2_1
node1_1  | 2020-12-03 14:00:13 UTC TRACE db canonize BestBlock { number: 0, hash: 3d86e3dfab8149f072e31eedb1ef645da7f7970c8e7998d6f96995cdd09cd07b }
node1_1  | 2020-12-03 14:00:13 UTC TRACE p2p::p2p Starting tcp server
node1_1  | 2020-12-03 14:00:13 UTC INFO p2p::p2p Inbound connections: (0/10)
node1_1  | 2020-12-03 14:00:13 UTC INFO p2p::p2p Outbound connections: (0/10)
node1_1  | 2020-12-03 14:00:13 UTC TRACE p2p::p2p Creating 0 more outbound connections
node3_1  | 2020-12-03 14:00:14 UTC TRACE db canonize BestBlock { number: 0, hash: 3d86e3dfab8149f072e31eedb1ef645da7f7970c8e7998d6f96995cdd09cd07b }
node4_1  | 2020-12-03 14:00:14 UTC TRACE db canonize BestBlock { number: 0, hash: 3d86e3dfab8149f072e31eedb1ef645da7f7970c8e7998d6f96995cdd09cd07b }
node3_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Inbound connections: (0/10)
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Starting tcp server
node3_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Outbound connections: (1/10)
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Creating 0 more outbound connections
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.2:8333
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Accepted connection from 10.0.0.4:43634
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received verack message from 10.0.0.4:43634
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Connected to 10.0.0.2:8333
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.2:8333
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 0 for request: getheaders
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getaddr message from 10.0.0.4:43634
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: addr, id: 0, final: true
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 0, final: true
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received addr message from 10.0.0.2:8333
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Updating node table with 1 entries
node2_1  | 2020-12-03 14:00:14 UTC TRACE db canonize BestBlock { number: 0, hash: 3d86e3dfab8149f072e31eedb1ef645da7f7970c8e7998d6f96995cdd09cd07b }
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Starting tcp server
node4_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Inbound connections: (0/10)
node4_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Outbound connections: (1/10)
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Creating 0 more outbound connections
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.2:8333
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Starting tcp server
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.2:8333
node2_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Inbound connections: (0/10)
node2_1  | 2020-12-03 14:00:14 UTC INFO p2p::p2p Outbound connections: (1/10)
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Creating 0 more outbound connections
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.4:43634
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 1
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Accepted connection from 10.0.0.3:45638
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 1 for request: getheaders
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Accepted connection from 10.0.0.5:46496
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 1, final: true
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.4:43634
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received verack message from 10.0.0.5:46496
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Connected to 10.0.0.2:8333
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.2:8333
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 0 for request: getheaders
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 0, final: true
node3_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.2:8333
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received verack message from 10.0.0.3:45638
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getaddr message from 10.0.0.3:45638
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getaddr message from 10.0.0.5:46496
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: addr, id: 0, final: true
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Connected to 10.0.0.2:8333
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.2:8333
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 0 for request: getheaders
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 0, final: true
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.5:46496
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 1
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 1 for request: getheaders
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.5:46496
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 0
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: addr, id: 0, final: true
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 1, final: true
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received getheaders message from 10.0.0.3:45638
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context declared response: 1
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::protocol::sync declared response 1 for request: getheaders
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::net::peer_context response ready: headers, id: 1, final: true
node1_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.3:45638
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received addr message from 10.0.0.2:8333
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Updating node table with 3 entries
node2_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.2:8333
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received addr message from 10.0.0.2:8333
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Updating node table with 3 entries
node4_1  | 2020-12-03 14:00:14 UTC TRACE p2p::p2p Received headers message from 10.0.0.2:8333
node1_1  | 2020-12-03 14:00:23 UTC INFO p2p::p2p Inbound connections: (3/10)
node1_1  | 2020-12-03 14:00:23 UTC INFO p2p::p2p Outbound connections: (0/10)
node1_1  | 2020-12-03 14:00:23 UTC TRACE p2p::p2p Creating 0 more outbound connections
node3_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Inbound connections: (0/10)
node3_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Outbound connections: (1/10)
node3_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Creating 1 more outbound connections
node3_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.4:43634
node3_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.4:43634: Connection refused (os error 111)
node2_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Inbound connections: (0/10)
node2_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Outbound connections: (1/10)
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Creating 3 more outbound connections
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.5:46496
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.3:45638
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.4:43634
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.5:46496: Connection refused (os error 111)
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.3:45638: Connection refused (os error 111)
node2_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.4:43634: Connection refused (os error 111)
node4_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Inbound connections: (0/10)
node4_1  | 2020-12-03 14:00:24 UTC INFO p2p::p2p Outbound connections: (1/10)
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Creating 3 more outbound connections
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.5:46496
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.3:45638
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Trying to connect to: 10.0.0.4:43634
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.4:43634: Connection refused (os error 111)
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.3:45638: Connection refused (os error 111)
node4_1  | 2020-12-03 14:00:24 UTC TRACE p2p::p2p Unable to connect to 10.0.0.5:46496: Connection refused (os error 111)
^CGracefully stopping... (press Ctrl+C again to force)
Stopping tools_node4_1 ... done
Stopping tools_node3_1 ... done
Stopping tools_node2_1 ... done
Stopping tools_node1_1 ... done

Environment

Additional context

OS error 111 means Connection refused, according to https://mariadb.com/kb/en/operating-system-error-codes/

Resolution

SebastianElvis commented 3 years ago

The problem seems to be that the ports of endpoints inserted to the NodeTable are incorrect. These are outbound ports rather than the inbound one (i.e., the standard P2P port 8333)

image

SebastianElvis commented 3 years ago

This is the nodes.csv file from parity-bitcoin. This should be the expected behaviour.

image

SebastianElvis commented 3 years ago

Format of nodes.csv:

IP:Port Timestamp Supported_services Failures

See p2p/src/util/node_table.rs

SebastianElvis commented 3 years ago

Pretty sure it's because of the wrong ports. I print the addr messages and here is the log. The port should be 8333 rather than 49588.

image

I'll check how to fix this issue.

0xmountaintop commented 3 years ago

I think that's expected behavior. Even running pbtc in real network it shows the following ports if using - c

Node1 listening to all on 8333: image

Node2 connecting to Node1, listening on 9997: image

Node3 connecting to Node1, listening on 9999: image

0xmountaintop commented 3 years ago

To me it's more like a bug of pbtc itself: a node doesn't announce its listening port to its peers.

SebastianElvis commented 3 years ago

This is consistent to my observation. I compared the p2p modules in both projects and found no difference in terms of the session/messaging protocols.

If this is the case, then

0xmountaintop commented 3 years ago
  • How can we fix it? I guess for prototyping we can just enforce the default port

That will require a subprotocol. Let's use default port for experiment first.

SebastianElvis commented 3 years ago
  • How can we fix it? I guess for prototyping we can just enforce the default port

That will require a subprotocol. Let's use default port for experiment first.

Agree. For future implementation perhaps we should check how other implementations deal with relevant messages.

SebastianElvis commented 3 years ago

For production, there is an alternate fix for this issue. When initiating the p2p object, the node inserts its address (IP+port) to the node table. When accepting connections, nodes should not add other nodes' addresses to the node table, as at this stage nodes cannot know others' ports for inbound connections. When in the addr protocol, nodes exchange their node tables, which include addresses of themselves.

0xmountaintop commented 3 years ago

For production, there is an alternate fix for this issue. When initiating the p2p object, the node inserts its address (IP+port) to the node table. When accepting connections, nodes should not add other nodes' addresses to the node table, as at this stage nodes cannot know others' ports for inbound connections. When in the addr protocol, nodes exchange their node tables, which include addresses of themselves.

Then in the addr protocol, we need to insert exchanged node table contents into local node table. However, we may still need to define a expiry mechanism for cleaning node table incase it bloats.

SebastianElvis commented 3 years ago

For production, there is an alternate fix for this issue. When initiating the p2p object, the node inserts its address (IP+port) to the node table. When accepting connections, nodes should not add other nodes' addresses to the node table, as at this stage nodes cannot know others' ports for inbound connections. When in the addr protocol, nodes exchange their node tables, which include addresses of themselves.

Then in the addr protocol, we need to insert exchanged node table contents into local node table. However, we may still need to define a expiry mechanism for cleaning node table incase it bloats.

I guess this has been implemented. In the node table implementation, there is an item timestamp specifying when the address is queried. Will double check