Open ko0f opened 2 years ago
hi @ko0f, could you check this docker file? I did it on my end and it works.
Yes @Vovchyk, that's the one I use.
Internet works, I have another RSK node with older version which works on the same network.
What possible reasons for instant disconnect are there?
@ko0f, I can think of some issue with your network / firewall or docker configuration. First of all, I'd try to run rskj
node outside docker to see if it's related to your network. Btw, you can use net_peerCount
/ net_peerList
JSON_RPC methods to see how many peers have been connected, e.g. curl -X POST 'http://localhost:4444' -H 'Content-Type: application/json' --data '{ "jsonrpc":"2.0", "method":"net_peerCount", "params":[], "id":1 }'
. web3_clientVersion
and net_version
are also handy api methods.
Peer count {"jsonrpc":"2.0","id":67,"result":"0x0"}
Client version RskJ/3.2.0/Linux/Java1.8/IRIS-a4e57af
Net version 0
Other docker containers seem to work well. I've also opened a shell for RSK docker docker run -it -v "/rsk:/rsk" rsk:latest /bin/bash
and pinged 1.1.1.1
, got a pong.
Here's a sample of the log -
2022-03-10-12:43:39.634 INFO [sync] Connecting to: bootstrap09.rsk.co:5050
2022-03-10-12:43:39.651 INFO [sync] Connecting to: bootstrap10.rsk.co:5050
2022-03-10-12:43:39.672 INFO [sync] Connecting to: bootstrap03.rsk.co:5050
2022-03-10-12:43:39.688 INFO [sync] Connecting to: bootstrap15.rsk.co:5050
2022-03-10-12:43:40.064 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:40.151 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{4e906070623d61dc9d1c5c249f092541516168894db8d2fe277d2a6fe08b65642dfa7c756cbae95543b867eb475496616ab314652c8b0eb64474878a765491e8}, Address: 147.182.152.57
2022-03-10-12:43:40.172 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:40.172 DEBUG [syncprocessor] Receiving syncState from node NodeID{4e906070623d61dc9d1c5c249f092541516168894db8d2fe277d2a6fe08b65642dfa7c756cbae95543b867eb475496616ab314652c8b0eb64474878a765491e8} block 4145354 549f552b968de392a6e8feb0d7c20608c85b80514fcdb8a202160d366140026e
2022-03-10-12:43:40.172 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000061] seconds.
2022-03-10-12:43:40.424 INFO [sync] Peer 4e906070623d61dc9d1c5c249f092541516168894db8d2fe277d2a6fe08b65642dfa7c756cbae95543b867eb475496616ab314652c8b0eb64474878a765491e8: added to pool
2022-03-10-12:43:41.041 INFO [sync] Peer 4e906070623d61dc9d1c5c249f092541516168894db8d2fe277d2a6fe08b65642dfa7c756cbae95543b867eb475496616ab314652c8b0eb64474878a765491e8: disconnected
2022-03-10-12:43:41.172 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:42.173 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:42.702 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:42.702 INFO [sync] Connecting to: bootstrap03.rsk.co:5050
2022-03-10-12:43:42.702 INFO [sync] Connecting to: bootstrap14.rsk.co:5050
2022-03-10-12:43:42.714 INFO [sync] Connecting to: bootstrap04.rsk.co:5050
2022-03-10-12:43:42.726 INFO [sync] Connecting to: bootstrap07.rsk.co:5050
2022-03-10-12:43:43.173 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:44.174 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:45.174 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:45.743 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:45.743 INFO [sync] Connecting to: bootstrap05.rsk.co:5050
2022-03-10-12:43:45.754 INFO [sync] Connecting to: bootstrap01.rsk.co:5050
2022-03-10-12:43:46.174 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:46.632 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{eb08c63444cdb9b71608bee19adc7fa119ec68bf04b96a5718757bc5818a49dc448b8f7e9eff5b7bb60cbc940bd2a72b1e9d0be7567e12eaa83a068ed669c910}, Address: 13.70.177.106
2022-03-10-12:43:46.652 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:46.652 DEBUG [syncprocessor] Receiving syncState from node NodeID{eb08c63444cdb9b71608bee19adc7fa119ec68bf04b96a5718757bc5818a49dc448b8f7e9eff5b7bb60cbc940bd2a72b1e9d0be7567e12eaa83a068ed669c910} block 4145355 79f01ebe551382a88fceca9bf5e3eb5546d8786c7fe2cb76625277997a991d21
2022-03-10-12:43:46.652 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000123] seconds.
2022-03-10-12:43:47.653 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:48.653 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:48.768 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:48.768 INFO [sync] Connecting to: bootstrap13.rsk.co:5050
2022-03-10-12:43:49.425 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{9f27594562d2a11642d8737821a21771abd87e7b2587c6cb4fb778823153aa6e586fdee2b3b31fb999185185259a0b6d150c95db0b3369314fab8f88aeb6b874}, Address: 13.52.239.16
2022-03-10-12:43:49.445 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:49.445 DEBUG [syncprocessor] Receiving syncState from node NodeID{9f27594562d2a11642d8737821a21771abd87e7b2587c6cb4fb778823153aa6e586fdee2b3b31fb999185185259a0b6d150c95db0b3369314fab8f88aeb6b874} block 4145355 79f01ebe551382a88fceca9bf5e3eb5546d8786c7fe2cb76625277997a991d21
2022-03-10-12:43:49.446 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000059] seconds.
2022-03-10-12:43:50.446 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:51.446 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:51.788 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:51.788 INFO [sync] Connecting to: bootstrap02.rsk.co:5050
2022-03-10-12:43:51.806 INFO [sync] Connecting to: bootstrap08.rsk.co:5050
2022-03-10-12:43:51.873 INFO [sync] Connecting to: bootstrap12.rsk.co:5050
2022-03-10-12:43:52.141 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{509af6b66c0c20cded9fe73024225731ba961e85cc241878a84f03c1c2e0614f0cbf5eea104e140d936a59db84aa5f95cffdd1ce018d4e8fdde22f514452a247}, Address: 34.252.232.189
2022-03-10-12:43:52.162 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:52.162 DEBUG [syncprocessor] Receiving syncState from node NodeID{509af6b66c0c20cded9fe73024225731ba961e85cc241878a84f03c1c2e0614f0cbf5eea104e140d936a59db84aa5f95cffdd1ce018d4e8fdde22f514452a247} block 4145355 79f01ebe551382a88fceca9bf5e3eb5546d8786c7fe2cb76625277997a991d21
2022-03-10-12:43:52.162 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000063] seconds.
2022-03-10-12:43:52.426 INFO [sync] Peer 509af6b66c0c20cded9fe73024225731ba961e85cc241878a84f03c1c2e0614f0cbf5eea104e140d936a59db84aa5f95cffdd1ce018d4e8fdde22f514452a247: added to pool
2022-03-10-12:43:52.536 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{34e3a8552f8af05ce564acfab0b1754774f9ed4c1e0413f179b94026a450f9af17b9a92b1fe8ba26ea51f59787bf88bf6592cf6d48e624da13392da17af94978}, Address: 13.56.115.33
2022-03-10-12:43:52.564 INFO [sync] Peer 509af6b66c0c20cded9fe73024225731ba961e85cc241878a84f03c1c2e0614f0cbf5eea104e140d936a59db84aa5f95cffdd1ce018d4e8fdde22f514452a247: disconnected
2022-03-10-12:43:52.571 DEBUG [peerScoring] Recorded SUCCESSFUL_HANDSHAKE. NodeID{8b8f07ba446f9a5961f307977dc4f3b95368dbbcbca1dbdd5b115fc195eb92cdf49b98cb55da00253d654b84e8f3edde18e5280b54f1ccae20f84381a3bd7368}, Address: 191.232.162.215
2022-03-10-12:43:52.590 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:52.590 DEBUG [syncprocessor] Receiving syncState from node NodeID{8b8f07ba446f9a5961f307977dc4f3b95368dbbcbca1dbdd5b115fc195eb92cdf49b98cb55da00253d654b84e8f3edde18e5280b54f1ccae20f84381a3bd7368} block 4145355 79f01ebe551382a88fceca9bf5e3eb5546d8786c7fe2cb76625277997a991d21
2022-03-10-12:43:52.590 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000061] seconds.
2022-03-10-12:43:52.746 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:52.746 DEBUG [syncprocessor] Receiving syncState from node NodeID{34e3a8552f8af05ce564acfab0b1754774f9ed4c1e0413f179b94026a450f9af17b9a92b1fe8ba26ea51f59787bf88bf6592cf6d48e624da13392da17af94978} block 4145355 79f01ebe551382a88fceca9bf5e3eb5546d8786c7fe2cb76625277997a991d21
2022-03-10-12:43:52.746 DEBUG [messageProcess] Message[STATUS_MESSAGE] processed after [0.000127] seconds.
2022-03-10-12:43:53.426 INFO [sync] Peer 34e3a8552f8af05ce564acfab0b1754774f9ed4c1e0413f179b94026a450f9af17b9a92b1fe8ba26ea51f59787bf88bf6592cf6d48e624da13392da17af94978: added to pool
2022-03-10-12:43:53.485 INFO [sync] Peer 34e3a8552f8af05ce564acfab0b1754774f9ed4c1e0413f179b94026a450f9af17b9a92b1fe8ba26ea51f59787bf88bf6592cf6d48e624da13392da17af94978: disconnected
2022-03-10-12:43:53.746 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:54.747 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:54.882 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:54.882 INFO [sync] Connecting to: bootstrap08.rsk.co:5050
2022-03-10-12:43:54.883 INFO [sync] Connecting to: bootstrap12.rsk.co:5050
2022-03-10-12:43:55.056 DEBUG [peerScoring] Recorded FAILED_HANDSHAKE. NodeID{509af6b66c0c20cded9fe73024225731ba961e85cc241878a84f03c1c2e0614f0cbf5eea104e140d936a59db84aa5f95cffdd1ce018d4e8fdde22f514452a247}, Address: 34.252.232.189
2022-03-10-12:43:55.747 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:56.747 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:57.748 DEBUG [messageProcess] Queued Messages: 0
2022-03-10-12:43:57.883 DEBUG [discover] 14 Nodes retrieved from the PE.
2022-03-10-12:43:58.748 DEBUG [messageProcess] Queued Messages: 0
This may have been an unidentified connectivity issue, it stopped happening.
This happens again now with one of my nodes - I have two, one is working fine, the other fails to find peers for hours now. Nothing in configuration has changed, I've just restarted the server.
Last time it managed to find peers after very long time, even so - it's an issue if it sometimes takes too long to connect to peers.
$ curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":67}' http://localhost:4444
{"jsonrpc":"2.0","id":67,"result":"0x0"}
@ko0f , recently we indeed have had some temporary issues with peers connectivity. Please let us know if you're still facing it.
Seems so, as long as the node is up, all is well. Once restarted it takes a very long time to find peers and some times (like just happened yesterday) it doesn't find any peers and keeps attempting bootstrap nodes, all while the other node I have is up and synced.
I've restarted the node yesterday and now, after a day it managed to find 2 peers and got synced. Just need to avoid restarts. :\
@ko0f thanks for reporting this. We will check this out. This is definitely unexpected behaviour, there should not be such delay in finding peers.
Btw, you're still running the version RskJ/3.2.0/Linux/Java1.8/IRIS-a4e57af
of the node in a docker container in mainnet
?
You mentioned that you run two nodes and the issue happens only to one of them? Are they running on a single machine in docker? Do you use a default bridge
network mode?
Is it true that you are not facing this issue if the node is running outside docker?
Could you please also share details of the environment you use (eg. hardware spec, OS, java version)?
@Vovchyk this happens on both machines, once rskj is restarted.
Dockerfile -
#
# RSK Node Dockerfile
#
# Pull base image.
FROM ubuntu:latest
MAINTAINER RSK Release <support@rsk.co>
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update
RUN apt-get install -y --no-install-recommends openjdk-8-jre supervisor systemd software-properties-common
# Install RSK.
RUN groupadd --gid 888 rsk && useradd rsk -d /var/lib/rsk -s /sbin/nologin --uid=888 --gid=888
RUN \
add-apt-repository -y ppa:rsksmart/rskj && \
apt-get update && \
(echo rskj shared/accepted-rsk-license-v1-1 select true | /usr/bin/debconf-set-selections )&& \
apt-get install -y --no-install-recommends rskj && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Supervisod CONF
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY node.conf /etc/rsk/node.conf
COPY logback.xml /etc/rsk/logback.xml
COPY startrsk /usr/bin/startrsk
RUN chmod +x /usr/bin/startrsk
## MAINNET
EXPOSE 4444
EXPOSE 4445
EXPOSE 5050
# Define default command.
#CMD ["/usr/bin/supervisord"]
CMD ["/usr/bin/startrsk"]
node.conf peer section, which I added during my attempts to fix this -
peer {
discovery = {
enabled = true
ip.list = ["10.0.0.2:5050","bootstrap01.rsk.co:5050","bootstrap02.rsk.co:5050","bootstrap03.rsk.co:5050","bootstrap04.rsk.co:5050","bootstrap05.rsk.co:5050","bootstrap06.rsk.co:5050","bootstrap07.rsk.co:5050","bootstrap08.rsk.co:5050","bootstrap09.rsk.co:5050","bootstrap10.rsk.co:5050","bootstrap11.rsk.co:5050","bootstrap12.rsk.co:5050","bootstrap13.rsk.co:5050","bootstrap14.rsk.co:5050","bootstrap15.rsk.co:5050","bootstrap16.rsk.co:5050","18.166.58.153:5050","8.214.48.216:5050","213.94.62.19:5050","64.225.55.114:5050","52.53.166.254:5050","204.236.246.36:5050"]
bind.ip = 0.0.0.0
}
active = [
]
trusted = [
]
listen.port = 5050
connection.timeout = 5
channel.read.timeout = 30
maxActivePeers = 30
}
Will update to 3.3.0 now and check again.
It's been two hours and still, with 3.3.0 net_peerCount returns {"jsonrpc":"2.0","id":67,"result":"0x0"}
thanks @ko0f for the update. We've been investigating this.
Version 3.2.0 running on docker, fails to connect to any peer, including bootstrap nodes.
startup log -
but then it fails to connect to any peer, just keeps connecting and disconnecting right away -