fish-sammy opened 4 years ago
Turned on debug logging for the sync module, and I can't find any problem in there other than that it is deactivating peers all the time:
2020-02-15 07:31:09 UTC IO Worker #0 DEBUG sync Peer 79 timeout while we were asking them for ForkHeader; disconnecting.
2020-02-15 07:31:09 UTC IO Worker #0 DEBUG sync Disconnected 79
2020-02-15 07:31:09 UTC IO Worker #2 DEBUG sync Deactivating peer 59
2020-02-15 07:31:09 UTC IO Worker #2 DEBUG sync Connected 10:Geth/v1.9.7-stable-a718daa6/linux-amd64/go1.13.4
2020-02-15 07:31:10 UTC IO Worker #1 DEBUG sync Deactivating peer 43
2020-02-15 07:31:17 UTC IO Worker #0 INFO import Syncing #9485532 0xfce6…4249 0.06 blk/s 2.0 tx/s 0.1 Mgas/s 0+ 609 Qed #9486145 23/25 peers 30 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:25 UTC IO Worker #1 INFO import Syncing #9485532 0xfce6…4249 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 31 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:33 UTC IO Worker #3 INFO import Syncing #9485533 0x5925…2ea8 0.06 blk/s 2.1 tx/s 0.2 Mgas/s 0+ 609 Qed #9486145 24/25 peers 31 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:41 UTC IO Worker #0 INFO import Syncing #9485534 0x68b7…e3a3 0.06 blk/s 2.1 tx/s 0.2 Mgas/s 0+ 609 Qed #9486145 24/25 peers 35 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:50 UTC IO Worker #1 INFO import Syncing #9485534 0x68b7…e3a3 0.06 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 35 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:58 UTC IO Worker #3 INFO import Syncing #9485534 0x68b7…e3a3 0.12 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 35 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:31:58 UTC IO Worker #0 INFO import Syncing #9485535 0xd326…aa8e 0.06 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 35 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:14 UTC IO Worker #0 INFO import Syncing #9485535 0xd326…aa8e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 40 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:14 UTC IO Worker #3 INFO import Syncing #9485535 0xd326…aa8e 0.06 blk/s 9.3 tx/s 0.6 Mgas/s 0+ 609 Qed #9486145 24/25 peers 41 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:31 UTC IO Worker #3 INFO import Syncing #9485536 0xbe02…1efd 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 42 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:31 UTC IO Worker #0 INFO import Syncing #9485536 0xbe02…1efd 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 609 Qed #9486145 24/25 peers 43 MiB chain 24 GiB db 51 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:31 UTC IO Worker #3 DEBUG sync Deactivating peer 69
2020-02-15 07:32:32 UTC IO Worker #0 DEBUG sync Deactivating peer 62
2020-02-15 07:32:32 UTC IO Worker #2 DEBUG sync Deactivating peer 35
2020-02-15 07:32:33 UTC IO Worker #3 DEBUG sync Deactivating peer 60
2020-02-15 07:32:55 UTC IO Worker #3 INFO import Syncing #9485537 0xbdd8…3dee 0.06 blk/s 5.8 tx/s 0.3 Mgas/s 0+ 597 Qed #9486145 24/25 peers 43 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:55 UTC IO Worker #0 INFO import Syncing #9485536 0xbe02…1efd 0.06 blk/s 5.8 tx/s 0.3 Mgas/s 0+ 597 Qed #9486145 24/25 peers 43 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:32:55 UTC IO Worker #2 INFO import Syncing #9485537 0xbdd8…3dee 0.06 blk/s 5.7 tx/s 0.3 Mgas/s 0+ 597 Qed #9486145 24/25 peers 43 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 454 µs
2020-02-15 07:33:04 UTC IO Worker #3 DEBUG sync Disconnected 10
2020-02-15 07:33:04 UTC IO Worker #0 DEBUG sync Deactivating peer 9
2020-02-15 07:33:04 UTC IO Worker #3 DEBUG sync Deactivating peer 28
2020-02-15 07:33:04 UTC IO Worker #1 DEBUG sync Deactivating peer 32
2020-02-15 07:33:05 UTC IO Worker #2 DEBUG sync Deactivating peer 59
2020-02-15 07:33:05 UTC IO Worker #2 DEBUG sync Deactivating peer 90
2020-02-15 07:33:06 UTC IO Worker #3 DEBUG sync Deactivating peer 68
2020-02-15 07:33:06 UTC IO Worker #0 DEBUG sync Deactivating peer 59
2020-02-15 07:33:06 UTC IO Worker #3 DEBUG sync Deactivating peer 72
2020-02-15 07:33:08 UTC IO Worker #3 DEBUG sync Deactivating peer 10
2020-02-15 07:33:10 UTC IO Worker #3 DEBUG sync Deactivating peer 35
2020-02-15 07:33:10 UTC IO Worker #3 DEBUG sync Deactivating peer 7
2020-02-15 07:33:11 UTC IO Worker #0 DEBUG sync Deactivating peer 22
2020-02-15 07:33:11 UTC IO Worker #0 DEBUG sync Deactivating peer 45
2020-02-15 07:33:11 UTC IO Worker #0 DEBUG sync Deactivating peer 28
2020-02-15 07:33:12 UTC IO Worker #2 DEBUG sync Deactivating peer 78
2020-02-15 07:33:12 UTC IO Worker #3 DEBUG sync Deactivating peer 72
2020-02-15 07:33:12 UTC IO Worker #2 DEBUG sync Deactivating peer 67
2020-02-15 07:33:13 UTC IO Worker #3 DEBUG sync Deactivating peer 65
2020-02-15 07:33:13 UTC IO Worker #1 DEBUG sync Connected 90:Geth/v1.8.21-stable-ca0ce3e9/linux-amd64/go1.11.2
2020-02-15 07:33:13 UTC IO Worker #0 DEBUG sync Disconnected 90
2020-02-15 07:33:13 UTC IO Worker #2 DEBUG sync Connected 69:Geth/v1.8.21-stable-ca0ce3e9/linux-amd64/go1.11.2
2020-02-15 07:33:13 UTC IO Worker #0 DEBUG sync Disconnected 69
2020-02-15 07:33:14 UTC IO Worker #2 DEBUG sync Deactivating peer 59
2020-02-15 07:33:15 UTC IO Worker #2 DEBUG sync Deactivating peer 69
2020-02-15 07:33:15 UTC IO Worker #2 DEBUG sync Connected 67:Geth/v1.9.8-unstable-8c1e8de8-20191125/linux-amd64/go1.13
2020-02-15 07:33:19 UTC IO Worker #0 INFO import Syncing #9485537 0xbdd8…3dee 0.04 blk/s 1.0 tx/s 0.0 Mgas/s 0+ 597 Qed #9486145 24/25 peers 44 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:19 UTC IO Worker #3 INFO import Syncing #9485538 0xcbe1…715e 0.04 blk/s 1.0 tx/s 0.0 Mgas/s 0+ 597 Qed #9486145 23/25 peers 46 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:19 UTC IO Worker #2 INFO import Syncing #9485538 0xcbe1…715e 0.04 blk/s 1.0 tx/s 0.0 Mgas/s 0+ 597 Qed #9486145 24/25 peers 46 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:28 UTC IO Worker #0 WARN sync 67 -> GetNodeData: item 0/2 – slow state fetch for hash 0xb7e0794e3682be3c65cee8b3fcafd510fdebab7572b7e256ca22cf8ccfa7725d; took Instant { tv_sec: 6422723, tv_nsec: 831792813 }
2020-02-15 07:33:35 UTC IO Worker #2 DEBUG sync Disconnected 67
2020-02-15 07:33:35 UTC IO Worker #0 DEBUG sync Deactivating peer 6
2020-02-15 07:33:35 UTC IO Worker #2 DEBUG sync Deactivating peer 9
2020-02-15 07:33:35 UTC IO Worker #0 DEBUG sync Deactivating peer 29
2020-02-15 07:33:36 UTC IO Worker #0 DEBUG sync Deactivating peer 68
2020-02-15 07:33:36 UTC IO Worker #0 DEBUG sync Deactivating peer 26
2020-02-15 07:33:36 UTC IO Worker #3 WARN sync 67 -> GetNodeData: item 0/2 – slow state fetch for hash 0x77a74e359f735aa00acd12839ec0eacba67d51eb7d932294bdf41f998f33ae23; took Instant { tv_sec: 6422728, tv_nsec: 186277804 }
2020-02-15 07:33:36 UTC IO Worker #3 DEBUG sync "Error sending nodes: Expired"
2020-02-15 07:33:36 UTC IO Worker #3 DEBUG sync Deactivating peer 61
2020-02-15 07:33:36 UTC IO Worker #3 DEBUG sync Deactivating peer 62
2020-02-15 07:33:38 UTC IO Worker #1 DEBUG sync Deactivating peer 69
2020-02-15 07:33:39 UTC IO Worker #2 DEBUG sync Deactivating peer 71
2020-02-15 07:33:41 UTC IO Worker #3 DEBUG sync Connected 62:Geth/v1.9.9-stable-01744997/windows-amd64/go1.13.4
2020-02-15 07:33:42 UTC IO Worker #3 DEBUG sync Deactivating peer 12
2020-02-15 07:33:44 UTC IO Worker #3 INFO import Syncing #9485538 0xcbe1…715e 0.04 blk/s 0.7 tx/s 0.1 Mgas/s 0+ 597 Qed #9486145 24/25 peers 46 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:44 UTC IO Worker #2 INFO import Syncing #9485539 0xdf08…16f0 0.04 blk/s 0.7 tx/s 0.1 Mgas/s 0+ 597 Qed #9486145 23/25 peers 48 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:44 UTC IO Worker #0 INFO import Syncing #9485539 0xdf08…16f0 0.04 blk/s 0.7 tx/s 0.1 Mgas/s 0+ 597 Qed #9486145 24/25 peers 48 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:33:46 UTC IO Worker #2 DEBUG sync Disconnected 73
2020-02-15 07:33:48 UTC IO Worker #0 DEBUG sync Deactivating peer 69
2020-02-15 07:33:49 UTC IO Worker #3 DEBUG sync Deactivating peer 76
2020-02-15 07:33:49 UTC IO Worker #0 DEBUG sync Deactivating peer 17
2020-02-15 07:33:50 UTC IO Worker #2 DEBUG sync Deactivating peer 73
2020-02-15 07:33:50 UTC IO Worker #2 DEBUG sync Deactivating peer 22
2020-02-15 07:33:50 UTC IO Worker #3 DEBUG sync Deactivating peer 76
2020-02-15 07:33:51 UTC IO Worker #1 DEBUG sync Deactivating peer 18
2020-02-15 07:33:51 UTC IO Worker #3 DEBUG sync Deactivating peer 83
2020-02-15 07:33:52 UTC IO Worker #0 DEBUG sync Deactivating peer 35
2020-02-15 07:33:53 UTC IO Worker #1 DEBUG sync Deactivating peer 75
2020-02-15 07:33:53 UTC IO Worker #1 DEBUG sync Deactivating peer 73
2020-02-15 07:33:53 UTC IO Worker #2 DEBUG sync Connected 79:Geth/v1.9.10-stable-58cf5686/windows-amd64/go1.13.6
2020-02-15 07:34:08 UTC IO Worker #2 INFO import Syncing #9485539 0xdf08…16f0 0.04 blk/s 2.4 tx/s 0.2 Mgas/s 0+ 597 Qed #9486145 23/25 peers 48 MiB chain 24 GiB db 50 MiB queue 901 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:34:08 UTC IO Worker #3 INFO import Syncing #9485540 0xdd5b…a8ff 0.04 blk/s 2.4 tx/s 0.2 Mgas/s 0+ 597 Qed #9486145 24/25 peers 48 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
2020-02-15 07:34:08 UTC IO Worker #0 INFO import Syncing #9485540 0xdd5b…a8ff 0.04 blk/s 2.4 tx/s 0.2 Mgas/s 0+ 597 Qed #9486145 24/25 peers 48 MiB chain 24 GiB db 50 MiB queue 902 KiB sync RPC: 0 conn, 0 req/s, 248 µs
I started another node to re-sync the entire chain with the exact same pod specs/Docker image/configuration file, and weirdly it is able to utilize more CPUs and sync fast initially.
eth-parity-2 5246m 2882Mi
and
2020-02-15 08:00:14 UTC IO Worker #0 INFO import Syncing #374270 0x1e46…4939 1343.40 blk/s 1595.6 tx/s 43.9 Mgas/s 9527+11007 Qed #394791 23/25 peers 215 MiB chain 2 GiB db 50 MiB queue 50 MiB sync RPC: 0 conn, 0 req/s, 112 µs
However, it ran into the same issue after a while, and now it looks the same as my original nodes:
eth-parity-2 1059m 24888Mi
and
2020-02-15 14:41:09 UTC Syncing #1140796 0xd203…be2c 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+24181 Qed #1164977 22/25 peers 582 MiB chain 12 GiB db 121 MiB queue 69 MiB sync RPC: 0 conn, 0 req/s, 107 µs
2020-02-15 14:41:29 UTC Syncing #1140796 0xd203…be2c 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+24181 Qed #1164977 22/25 peers 582 MiB chain 12 GiB db 121 MiB queue 69 MiB sync RPC: 0 conn, 0 req/s, 107 µs
2020-02-15 14:41:29 UTC Syncing #1140796 0xd203…be2c 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+24181 Qed #1164977 22/25 peers 582 MiB chain 12 GiB db 121 MiB queue 69 MiB sync RPC: 0 conn, 0 req/s, 107 µs
2020-02-15 14:41:29 UTC Syncing #1140796 0xd203…be2c 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+24181 Qed #1164977 22/25 peers 582 MiB chain 12 GiB db 121 MiB queue 69 MiB sync RPC: 0 conn, 0 req/s, 107 µs
2020-02-15 14:41:29 UTC Syncing #1140796 0xd203…be2c 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+24181 Qed #1164977 22/25 peers 582 MiB chain 12 GiB db 121 MiB queue 69 MiB sync RPC: 0 conn, 0 req/s, 107 µs
Not sure if it is relevant at all, but I use a custom Dockerfile which just downloads the executable, as follows:
FROM debian:buster-slim as runner
ENV VERSION 2.7.2
ENV SHA256SUM fe992f0c9b229a4406e82b9ff6d388f4acb4d6ce2782cb79b7bc379e7965ae34
# show backtraces
ENV RUST_BACKTRACE 1
RUN set -ex \
&& apt-get update \
&& apt-get install -qq --no-install-recommends ca-certificates wget curl \
&& rm -rf /var/lib/apt/lists/*
RUN set -ex \
&& wget -q "https://releases.parity.io/ethereum/v${VERSION}/x86_64-unknown-linux-gnu/parity" \
&& echo "${SHA256SUM} parity" | sha256sum -c \
&& chmod u+x parity \
&& mv parity /usr/local/bin/ \
&& apt-get remove -qq wget
COPY ./config.testnet.toml /root/config.testnet.toml
COPY ./config.toml /root/config.toml
COPY ./docker-entrypoint.sh /root/entrypoint.sh
COPY ./check-readiness.sh /check-readiness.sh
RUN chmod +x /check-readiness.sh
COPY ./check-liveness.sh /check-liveness.sh
RUN chmod +x /check-liveness.sh
EXPOSE 9091 9092
ENTRYPOINT ["/root/entrypoint.sh"]
The GetNodeData entries in the logs are likely Geth nodes fast-syncing off your node (it could also be light clients, but that seems much less likely); the slow database fetches are really very, very slow. Never seen anything that bad, tbh.
I would definitely try with a shorter pruning history; 20,000 blocks is very high. Are you sure you need that much state?
Even if you do need the state for that many old blocks, as an experiment I'd dial that down to, say, 100, and also remove the increased verifier threads and the extra cache settings; we've seen reports of memory leaks in some configurations. In other words, try with a config as close to the defaults as possible and let us know if the node behaves better. If that helps, we can turn your settings back on one at a time. Thanks!
@dvdplm Thanks for replying! Unfortunately, due to the nature of the project I am working on, we need to keep 3 days' worth of state data to be able to replay transactions. We have been running previous versions of Parity with the same pruning history for almost 2 years and never had any problem with it. With that said, I will try reducing the pruning history and get back to you here. I am just wondering: if I reduce the pruning history now and increase it later, will the node be able to get that state data back retroactively? Or will it just continue storing more from the point when I turn it back up?
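To be concrete, the replay we depend on can be sketched as a single JSON-RPC call (a rough example; the transaction hash here is a placeholder, and port 9091 is just the RPC port from our config, so treat both as assumptions):

```shell
# Build the JSON-RPC request for trace_replayTransaction with the "stateDiff" tracer.
# TX_HASH is a placeholder value, not a real transaction.
TX_HASH="0x0000000000000000000000000000000000000000000000000000000000000000"
PAYLOAD=$(cat <<EOF
{"jsonrpc":"2.0","method":"trace_replayTransaction","params":["$TX_HASH",["stateDiff"]],"id":1}
EOF
)
echo "$PAYLOAD"
# Against a live node it would be sent with something like:
#   curl -s -H 'Content-Type: application/json' -d "$PAYLOAD" http://localhost:9091
```

This only works while the state for the block containing the transaction is still un-pruned, which is why the pruning history matters to us.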
@dvdplm I reduced my configuration to the following as you suggested
# This config should be placed in following path:
# ~/.local/share/io.parity.ethereum/config.toml
[parity]
# Ethereum Mainnet Network
chain = "mainnet"
# Parity continuously syncs the chain
mode = "active"
# No updates will be auto-installed
auto_update = "none"
# Disables auto downloading of new releases. Not recommended.
no_download = true
# Blockchain and settings will be stored in /data.
base_path = "/data"
[rpc]
# Allows Cross-Origin Requests from domain '*'.
cors = ["*"]
# JSON-RPC will be listening for connections on IP all.
interface = "all"
# Only selected APIs will be exposed over this interface.
apis = ["web3", "eth", "net", "traces"]
# Allow connections only using specified addresses.
hosts = ["all"]
# JSON-RPC over HTTP will be accessible on port 9091.
port = 9091
[websockets]
# JSON-RPC will be listening for connections on IP all.
interface = "all"
# Allows connecting from Origin 'all'.
origins = ["all"]
# Only selected APIs will be exposed over this interface.
apis = ["pubsub"]
# JSON-RPC over WebSockets will be accessible on port 9092.
port = 9092
[ipc]
# You won't be able to use IPC to interact with Parity.
disable = true
[footprint]
# Compute and Store tracing data. (Enables trace_* APIs).
tracing = "on"
# Prune old state data. Maintains journal overlay - fast but extra 50MB of memory used.
pruning = "fast"
# Will keep up to 100 old state entries.
pruning_history = 100
The node is now not syncing at all:
2020-02-16 04:17:19 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 8 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:17:37 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 5 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:17:46 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:17:46 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 8 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:17:54 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:03 UTC 82 -> GetNodeData: item 0/2 – slow state fetch for hash 0x6886e89b660f08cf1192ea5b1d524bb678ad82c96dcff9149aa5fd0c9dfb7b1d; took Instant { tv_sec: 27684, tv_nsec: 539620713 }
2020-02-16 04:18:03 UTC 82 -> GetNodeData: item 0/178 – slow state fetch for hash 0x424d087e449e866929ffaff1ba29ca6938e652626309900478977dad4d6b5622; took Instant { tv_sec: 27678, tv_nsec: 576652674 }
2020-02-16 04:18:12 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 5 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:12 UTC 82 -> GetNodeData: item 0/2 – slow state fetch for hash 0x6402f32704242f3a5e9fa5fc997ba4eeb42cdb37d22712bfce1770fa60751c40; took Instant { tv_sec: 27688, tv_nsec: 384819723 }
2020-02-16 04:18:21 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 22/25 peers 6 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:21 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 9 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:30 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 6 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:48 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 6 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:57 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:18:57 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:06 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 8 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:24 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 4 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:32 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 5 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:32 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 5 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:41 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:19:58 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:20:07 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 5 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:20:07 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 6 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:20:16 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 7 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
2020-02-16 04:20:34 UTC Syncing #9486815 0x4078…67d2 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 4976 Qed #9491799 21/25 peers 8 MiB chain 24 GiB db 456 MiB queue 7 MiB sync RPC: 0 conn, 0 req/s, 0 µs
Is it possible that my database is corrupted as a result of upgrading from 2.5.13 to 2.7.2 in a way which does not crash Parity but only significantly slows it down? If that is the case, this might be reproducible as both of my nodes are having the same issue.
I have the same problem when I use the Docker image "btccom/parity:v2.5.13-btcpool". Here is my config.toml and logs:
# Parity Config Generator
# https://paritytech.github.io/parity-config-generator/
#
# This config should be placed in following path:
# ~/.local/share/io.parity.ethereum/config.toml
[parity]
# Ethereum Main Network (ETH)
chain = "foundation"
# Parity continuously syncs the chain
mode = "active"
# Disables auto downloading of new releases. Not recommended.
no_download = true
[ipc]
# You won't be able to use IPC to interact with Parity.
disable = true
#[dapps]
# You won't be able to access any web Dapps.
#disable = true
[rpc]
# JSON-RPC will be listening for connections on IP 0.0.0.0.
interface = "0.0.0.0"
# Allows Cross-Origin Requests from domain '*'.
cors = ["*"]
[mining]
# Account address to receive reward when block is mined.
author = "0xaa185ca60063d8418e0f623c5d39713bcc13041a"
# Blocks that you mine will have this text in extra data field.
# Set a different node-id for each node to ensure results of getwork are unique.
extra_data = "Easy to Mine"
# Prepare a block to seal even when there are no miners connected.
force_sealing = true
# Force the node to author new blocks when a new uncle block is imported.
reseal_on_uncle = true
# New pending block will be created for all transactions (both local and external).
reseal_on_txs = "all"
# New pending block will be created only once per 4000 milliseconds.
reseal_min_period = 4000
# Parity will keep/relay at most 8192 transactions in queue.
tx_queue_size = 8192
tx_queue_per_sender = 128
[network]
# Parity will sync by downloading latest state first. Node will be operational in couple minutes.
warp = true
# Specify a path to a file with peers' enodes to be always connected to.
reserved_peers = "/home/parity/.local/share/io.parity.ethereum/peer.list"
# Parity will try to maintain connection to at least 100 peers.
min_peers = 100
# Parity will maintain at most 200 peers.
max_peers = 200
[misc]
logging = "own_tx=trace,sync=info,chain=info,network=info,miner=trace"
log_file = "/home/parity/.local/share/io.parity.ethereum/parity.log"
[footprint]
# Prune old state data. Maintains journal overlay - fast but extra 50MB of memory used.
pruning = "fast"
# If defined, will never use more than 1024MB for all caches. (Overrides other cache settings.)
cache_size = 1024
[snapshots]
# Disables automatic periodic snapshots.
disable_periodic = true
2020-02-16 08:31:03.882859050 UTC IO Worker #1 INFO import Syncing #9465825 0xa517…b340 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-16 08:31:08.862218367 UTC IO Worker #1 INFO import Syncing #9465825 0xa517…b340 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:13.867002799 UTC IO Worker #1 INFO import Syncing #9465826 0x54b9…9d14 0.20 blk/s 35.8 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-16 08:31:18.863398454 UTC IO Worker #1 INFO import Syncing #9465826 0x54b9…9d14 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:23.862675134 UTC IO Worker #1 INFO import Syncing #9465826 0x54b9…9d14 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-16 08:31:28.864881371 UTC IO Worker #2 INFO import Syncing #9465827 0x3bab…263e 0.20 blk/s 49.4 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 13 µs
2020-02-16 08:31:33.864014689 UTC IO Worker #2 INFO import Syncing #9465827 0x3bab…263e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:38.869048210 UTC IO Worker #1 INFO import Syncing #9465827 0x3bab…263e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:43.872461997 UTC IO Worker #1 INFO import Syncing #9465829 0x178e…48e3 0.40 blk/s 50.0 tx/s 2.1 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:48.880003171 UTC IO Worker #1 INFO import Syncing #9465829 0x178e…48e3 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:53.865408752 UTC IO Worker #1 INFO import Syncing #9465829 0x178e…48e3 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 97/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:31:58.866663314 UTC IO Worker #2 INFO import Syncing #9465830 0x932f…196f 0.20 blk/s 25.6 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-16 08:32:03.269516973 UTC http.worker00 WARN rpc eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-02-16 08:32:03.875034170 UTC IO Worker #2 INFO import Syncing #9465830 0x932f…196f 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-16 08:32:08.870466519 UTC IO Worker #2 INFO import Syncing #9465830 0x932f…196f 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 166 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-16 08:32:13.879979218 UTC IO Worker #2 INFO import Syncing #9465831 0x645e…de83 0.20 blk/s 42.6 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:32:18.878494405 UTC IO Worker #1 INFO import Syncing #9465832 0x735e…235f 0.20 blk/s 26.8 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-16 08:32:23.886999404 UTC IO Worker #1 INFO import Syncing #9465832 0x735e…235f 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:32:28.866493074 UTC IO Worker #1 INFO import Syncing #9465833 0xff0a…abf0 0.20 blk/s 25.6 tx/s 2.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-16 08:32:33.870349018 UTC IO Worker #1 INFO import Syncing #9465833 0xff0a…abf0 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 2367 Qed #9468204 98/100 peers 69 MiB chain 167 MiB db 258 MiB queue 1 GiB sync RPC: 0 conn, 1 req/s, 14 µs
The sync speed cannot even catch up with mainnet's.
What can I do about that?
I am just wondering if I reduce the pruning history right now and increase it later, will the node be able to get back to those state data retroactively? Or will it just then continue storing more since the point when I turn it back?
No, once pruned there is no way to get the pruned state back; it will, as you say, continue storing more from the point you increase it again.
We have upgraded all of our bootnodes from 2.5 to 2.7 and have not seen these issues, so whatever this is, I don't think it affects everyone (making it trickier to debug, ofc).
@sammy1991106 Do you think you can share your RocksDB LOG file with us? You can find it in <your base path>/chains/ethereum/db/<your account>/overlayrecent/db/LOG. It's a long shot, but maybe you're hitting some kind of pathological scenario w.r.t. write amplification. :/
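For anyone else digging into this, here is a rough sketch of locating that LOG and skimming it for write-stall messages; the /data base path is an assumption taken from the config earlier in the thread, so substitute your own --base-path:

```shell
# Hypothetical helper: find the RocksDB LOG under the Parity base path and
# grep it for write-stall lines, a classic symptom of write amplification.
# BASE_PATH is an assumed default; override it with your actual --base-path.
BASE_PATH="${BASE_PATH:-/data}"
DB_LOG=$(find "$BASE_PATH/chains/ethereum/db" -path '*overlayrecent/db/LOG' 2>/dev/null | head -n 1)
if [ -n "$DB_LOG" ]; then
  echo "RocksDB log: $DB_LOG"
  # RocksDB logs "Stalling writes" / "Stopping writes" when compaction falls behind.
  grep -E 'Stalling writes|Stopping writes' "$DB_LOG" | tail -n 20
else
  echo "RocksDB log not found under $BASE_PATH"
fi
```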
@dvdplm In the meantime, is it doable to warp sync to a specific block X and then turn on tracing and full sync from there?
I run another Parity node as follows:
ExecStart=/usr/bin/parity --author "***" --mode active --pruning fast --db-compaction ssd --cache-size 1024 --config /etc/parity/config.toml --log-file /root/parity.log
And here is my config.toml:
# This config should be placed in following path:
# %AppData%\Parity\Ethereum\config.toml
[rpc]
# JSON-RPC will be listening for connections on IP 0.0.0.0.
interface = "0.0.0.0"
# Only selected APIs will be exposed over this interface.
apis = ["web3", "eth", "pubsub", "net", "parity", "parity_pubsub", "traces", "shh", "shh_pubsub", "personal", "parity_accounts", "secretstore", "rpc", "private"]
The log is as follows:
2020-02-17 14:30:22 Syncing #9469862 0x3e34…fdce 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-17 14:30:28 Syncing #9469863 0xce69…1a9b 0.20 blk/s 29.6 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:30:33 Syncing #9469864 0x5b65…6aae 0.20 blk/s 11.4 tx/s 0.9 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-17 14:30:37 Syncing #9469864 0x5b65…6aae 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-17 14:30:43 Syncing #9469864 0x5b65…6aae 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-17 14:30:47 Syncing #9469865 0x9d00…5ead 0.20 blk/s 29.2 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:30:53 Syncing #9469866 0x0dfa…0905 0.20 blk/s 9.6 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-17 14:30:55 eth_accounts is deprecated and will be removed in future versions: Account management is being phased out see #9997 for alternatives.
2020-02-17 14:30:58 Syncing #9469867 0x6ef2…f0a9 0.20 blk/s 6.2 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 3 req/s, 14 µs
2020-02-17 14:31:03 Syncing #9469867 0x6ef2…f0a9 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:31:07 Syncing #9469867 0x6ef2…f0a9 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 89 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 2 req/s, 14 µs
2020-02-17 14:31:13 Syncing #9469868 0xf944…3f57 0.20 blk/s 29.4 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:31:17 Syncing #9469869 0xe821…9f5a 0.20 blk/s 8.2 tx/s 2.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 91 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:31:22 Syncing #9469869 0xe821…9f5a 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 91 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:31:28 Syncing #9469870 0xd581…2243 0.20 blk/s 20.2 tx/s 1.5 Mgas/s 0+ 1493 Qed #9471364 35/50 peers 9 MiB chain 90 MiB db 140 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
2020-02-17 14:31:32 Syncing #9469871 0x3a49…797e 0.20 blk/s 11.4 tx/s 2.0 Mgas/s 0+ 1481 Qed #9471364 35/50 peers 9 MiB chain 91 MiB db 139 MiB queue 13 MiB sync RPC: 0 conn, 1 req/s, 14 µs
The parity version is:
root@iZj6c20llck3brqzgtwgh7Z:~# parity --version
Parity Ethereum Client.
version Parity-Ethereum/v2.7.2-stable-2662d19-20200206/x86_64-unknown-linux-gnu/rustc1.41.0
Copyright 2015-2020 Parity Technologies (UK) Ltd.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
By Wood/Paronyan/Kotewicz/Drwięga/Volf/Greeff
Habermeier/Czaban/Gotchac/Redman/Nikolsky
Schoedon/Tang/Adolfsson/Silva/Palm/Hirsz et al.
The result is the same: it syncs very, very slowly.
Hoping for a response, thank you.
warp sync to a specific block X, and then turn on tracing and full sync from there?
I'm afraid that is not currently possible. I don't know of a deep technical reason for it, though, so I think it could be done. But it's not a solution to your current problem either way.
@EthanHee so your node is not a tracing node then? And using 2.5 it syncs significantly quicker?
@dvdplm I have re-synced two 2.5.13 nodes and two 2.7.2 nodes, with warp sync turned on and the warp barrier targeting block 9450000. I realized that we only use trace_replayTransaction, which is part of the Ad-hoc Tracing API and does not require the trace db. Here is what I observed.
1. 2.5.13 syncs to the tip of the chain way faster than 2.7.2. I can't really say how much faster because I ended up killing the 2.7.2 ones before they caught up. By the time the 2.5.13 nodes were caught up, the 2.7.2 ones were still > 20000 blocks away from the tip.
2. After both 2.5.13 and 2.7.2 finished the initial state download and were running full sync, I redirected our system to run against them to see how trace_replayTransaction performs. What we basically do is replay all transactions in blocks and ask for stateDiff only. The 2.5.13 nodes were way faster. I didn't do the math, but visually we were getting all state diffs at ~15 seconds per block with 2.7.2 vs. less than 10 seconds per block with 2.5.13. (We experimented with both for about 2 hours, so it should not just be that transaction counts or gas costs in blocks are uneven.)
3. When trace_replayTransaction calls were flooded to the nodes, 2.5.13 was able to use all the CPUs available, while 2.7.2 only used 3 or 4 out of 16 cores (we set the rpc server_threads to match the number of CPUs we have on the machine).
Let me know if you find anything. We would love to provide logs, debugging, and PRs to help solve the issue.
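The workload described above (replaying all transactions in blocks and asking for stateDiff only) comes down to one JSON-RPC call per block. Below is a minimal sketch, not the poster's actual code; the endpoint URL is an assumption (port 9091 appears elsewhere in this thread):

```python
import json
import urllib.request

RPC_URL = "http://localhost:9091"  # assumed local RPC endpoint; adjust to your node

def build_replay_request(block_number: int, traces=("stateDiff",)) -> dict:
    """Build the JSON-RPC payload for trace_replayBlockTransactions,
    requesting only the trace kinds needed (stateDiff here)."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "trace_replayBlockTransactions",
        "params": [hex(block_number), list(traces)],
    }

def replay_block(block_number: int):
    """POST the request to the node and return the decoded result."""
    req = urllib.request.Request(
        RPC_URL,
        data=json.dumps(build_replay_request(block_number)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]
```

Since this uses only the Ad-hoc Tracing API, it should work against a node synced without tracing = "on", matching the observation above that no trace db is required.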
2. After both 2.5.13 and 2.7.2 finished the initial state download and were running full sync, I redirected our system to run against them to see how trace_replayTransaction performs. What we basically do is replay all transactions in blocks and ask for stateDiff only. The 2.5.13 nodes were way faster. I didn't do the math, but visually we were getting all state diffs at ~15 seconds per block with 2.7.2 vs. less than 10 seconds per block with 2.5.13. (We experimented with both for about 2 hours, so it should not just be that transaction counts or gas costs in blocks are uneven.)
Do I understand you correctly that here the two nodes are not archive nodes and not tracing, just plain old vanilla full nodes? I have benchmarked an archive+tracing node at both 2.5 and 2.7 today and I have not found any significant performance difference, but my node is not fully synced; I have ~2.5 million blocks in the DB. Here are some numbers running trace_replayBlockTransactions for block 0x1a3d52, which returns a big trace (more than 50 MB) on both versions:
2.7, single big trace, single thread
------------------------------------
$ wrk --latency -c 1 -d 1m -t 1 --timeout 5 -s ./trace.lua http://127.0.0.1:8545
Running 1m test @ http://127.0.0.1:8545
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.08s 111.18ms 1.42s 74.55%
Req/Sec 0.22 0.42 1.00 78.18%
Latency Distribution
50% 1.06s
75% 1.16s
90% 1.25s
99% 1.42s
55 requests in 1.00m, 2.72GB read
Requests/sec: 0.92
Transfer/sec: 46.39MB
2.5, single big trace, single thread
------------------------------------
$ wrk --latency -c 1 -d 1m -t 1 --timeout 5 -s ./trace.lua http://127.0.0.1:8545
Running 1m test @ http://127.0.0.1:8545
1 threads and 1 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.15s 104.33ms 1.43s 65.38%
Req/Sec 0.02 0.14 1.00 98.08%
Latency Distribution
50% 1.14s
75% 1.23s
90% 1.29s
99% 1.43s
52 requests in 1.00m, 2.57GB read
Requests/sec: 0.87
Transfer/sec: 43.83MB
Do you think you could run
time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["0x1a3d52", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:8545 > out-2.7.json
on your 2.5 and 2.7 nodes and report back what timings you get? It's the same block I used with wrk above, and here I see response times between 1.3 and 1.8 seconds on both versions, consistent with the wrk results above.
Thanks!
EDIT: I ran my benchmark with --mode offline to reduce the noise on the node.
Hey @dvdplm, I'm working with @sammy1991106 on this. While we wait for him to get you those numbers, I will say that one of the main concerns is performance under load. It seems as though 2.5 was able to better manage a flurry of trace_replayTransaction requests while it was syncing. It appears as though 2.5 is able to parallelize more work than 2.7. I'll have to take another look at the changes between 2.5 and 2.7 and see if I have any suspicions about what could have affected this.
In other words, I suspect that a single request on a single CPU while in offline mode will not yield a significant difference. @sammy1991106, I'll let you prove or disprove my hypothesis.
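Since the concern is behavior under load rather than single requests, a small harness can compare wall-clock times for a flood of identical requests on each version. This is a sketch, not the team's actual tooling; the worker count and the request function you pass in are placeholders:

```python
import concurrent.futures
import time

def run_concurrent(fn, items, workers=16):
    """Apply `fn` to every item across `workers` threads and return the
    wall-clock time taken. If the node parallelizes well, raising
    `workers` should keep reducing the elapsed time until its cores are
    saturated; a flat curve suggests work is serialized server-side."""
    start = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fn, items))  # drain the iterator so we wait for all calls
    return time.monotonic() - start
```

Pointing fn at an HTTP POST of trace_replayTransaction and items at a list of transaction hashes would reproduce the flood described above, and comparing elapsed times between 2.5 and 2.7 at various worker counts would quantify the parallelization difference.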
I'll have to take another look at the changes between 2.5 and 2.7 and see if I have any suspicions about what could have affected this.
Lovely, this definitely could use more eyes. One place to look is the switch to jsonrpc v14 and the changes related to threading, e.g. this. During our testing we did not see any regressions, but we did not test on tracing nodes.
In other words, I suspect that a single request on a single cpu while in offline mode will not yield a significant difference.
I agree; what I was after was confirmation that what I'm seeing on my system matches up with what you are seeing. If, on the other hand, we had a severe regression on a single query on a single thread, I'd be more inclined to look into the database layer.
I have to re-sync to get block 0x1a3d52, but in the meantime, I'll post some "benchmarks" I ran with the state that we had.
Block: 0x9146a2
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 27.2M 100 27.2M 100 122 13.3M 59 0:00:02 0:00:02 --:--:-- 13.3M
real 0m2.130s
user 0m0.005s
sys 0m0.035s
$ echo $BLOCK $IMAGE
0x9146a2 parity-2.5.13
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 27.2M 100 27.2M 100 122 4716k 20 0:00:06 0:00:05 0:00:01 7521k
real 0m5.994s
user 0m0.008s
sys 0m0.037s
$ echo $BLOCK $IMAGE
0x9146a2 parity-2.6.8
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 27.2M 100 27.2M 100 122 11.0M 49 0:00:02 0:00:02 --:--:-- 11.0M
real 0m2.557s
user 0m0.004s
sys 0m0.039s
$ echo $BLOCK $IMAGE
0x9146a2 parity-2.7.2
and block 0x914699
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17.9M 100 17.9M 100 122 4248k 28 0:00:04 0:00:04 --:--:-- 4248k
real 0m4.396s
user 0m0.006s
sys 0m0.026s
$ echo $BLOCK $IMAGE
0x914699 parity-2.5.13
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17.9M 100 17.9M 100 122 4360k 28 0:00:04 0:00:04 --:--:-- 4360k
real 0m4.274s
user 0m0.006s
sys 0m0.023s
$ echo $BLOCK $IMAGE
0x914699 parity-2.6.8
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17.9M 100 17.9M 100 122 5255k 34 0:00:03 0:00:03 --:--:-- 5255k
real 0m3.576s
user 0m0.010s
sys 0m0.023s
$ echo $BLOCK $IMAGE
0x914699 parity-2.7.2
I noticed that if I run the query twice, I get a significant speed-up, so obviously there's some caching going on. I'm therefore not sure that running a proper load test really gives useful results beyond the 99th percentile.
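To separate those cache effects from raw query cost when benchmarking, one option is to time the same call repeatedly and compare the first (cold) run against the later (warm) ones. A minimal sketch of that idea (the helper name is mine, not from any tool used in this thread):

```python
import time

def cold_vs_warm(call, runs=4):
    """Time `call` several times; return (first_run, best_of_rest).
    A large gap between the two suggests the node serves repeat queries
    from a cache, so load tests should use distinct blocks per request."""
    timings = []
    for _ in range(runs):
        start = time.monotonic()
        call()
        timings.append(time.monotonic() - start)
    return timings[0], min(timings[1:])
```

If the cold/warm gap is large, a fair 2.5-vs-2.7 comparison should either query distinct blocks or report cold and warm timings separately.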
@dvdplm, I'll do the same thing for the block you were using too.
Here are the "benchmarks" with block 0x1a3d52:
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION-$BLOCK.json
Warning: The file name argument '-0x1a3d52.json' looks like a flag.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 50.6M 100 50.6M 100 122 15.4M 37 0:00:03 0:00:03 --:--:-- 15.4M
real 0m3.279s
user 0m0.008s
sys 0m0.066s
$ echo $BLOCK $IMAGE
0x1a3d52 parity-2.5.13
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION-$BLOCK.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 50.6M 100 50.6M 100 122 9.9M 24 0:00:05 0:00:05 --:--:-- 13.0M
real 0m5.090s
user 0m0.008s
sys 0m0.066s
$ echo $BLOCK $VERSION
0x1a3d52 parity-2.6.8
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION-$BLOCK.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 50.6M 100 50.6M 100 122 16.8M 40 0:00:03 0:00:02 0:00:01 16.8M
real 0m3.004s
user 0m0.008s
sys 0m0.068s
$ echo $BLOCK $VERSION
0x1a3d52 parity-2.7.2
Interestingly enough, I decided to run that last query a few more times to see if there was any speed-up, and it didn't improve by much; below is the fastest attempt:
$ time curl -H 'Content-type: application/json' -X POST -d '{"method":"trace_replayBlockTransactions","params":["'$BLOCK'", ["stateDiff", "vmTrace","trace"] ],"id":1,"jsonrpc":"2.0"}' http://localhost:9091 -o $VERSION-$BLOCK.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 50.6M 100 50.6M 100 122 20.1M 48 0:00:02 0:00:02 --:--:-- 20.1M
real 0m2.662s
user 0m0.012s
sys 0m0.071s
$ echo $BLOCK $VERSION
0x1a3d52 parity-2.7.2
Since the original posting of this issue, we have found optimizations related to our trace_replayTransaction strategy; however, that does not help us if the node eventually stops syncing. Given that we noticed concurrency-related issues and there have been changes in thread management, it almost seems like our nodes are deadlocking during the syncing process (on 2.7.2 they always tend to stop syncing altogether). Is this possible?
While it's possible you're seeing a deadlock, I would be more inclined to suspect thread starvation, where too small a pool of threads is continuously busy answering RPC requests rather than making progress on sync. That's a guess at this stage, though.
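One way to tell deadlock and starvation apart from the outside is to sample per-thread CPU time for the node's process twice and diff: in a deadlock essentially no thread accumulates CPU time, while under starvation a few threads stay saturated. A Linux-only sketch using /proc (nothing here is Parity-specific):

```python
import os

def thread_cpu_times(pid: int) -> dict:
    """Return {tid: cpu_ticks} for every thread of `pid`, from
    /proc/<pid>/task/<tid>/stat (Linux only). cpu_ticks is utime+stime."""
    times = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                # Split after the "(comm)" field, which may contain spaces.
                fields = f.read().rsplit(") ", 1)[1].split()
        except FileNotFoundError:
            continue  # thread exited between listdir and open
        # After the comm field: fields[11] = utime, fields[12] = stime.
        times[int(tid)] = int(fields[11]) + int(fields[12])
    return times
```

Sampling this a few seconds apart for the parity PID and diffing the counters shows how many IO/RPC worker threads are actually burning CPU versus sitting idle.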
We actually saw the syncing issue while the rpc server was off.
Same issue here. Parity Ethereum Client, version Parity-Ethereum/v2.7.2-stable-2662d1925-20200206/x86_64-unknown-linux-gnu/rustc1.39.0.
The RPC server is on, but not used. The initial download of the blockchain took more than 2 weeks... Now I'm at #8455500 with 2 blk/s.
Also have this issue. Is there any way to revert the DB change and go back to 2.5.13 with the data? Early testing did not show this bug, but now, due to a stuck sync and falling behind, it is a major issue, as 2.7.2 cannot catch back up. @dvdplm or anyone?
Any help or suggestions are greatly appreciated.
@kcanty we're investigating the problem at the moment.
In the meantime, you can use this tool to downgrade the database from 2.7 to 2.5.
Any update on this one?
I got the same issue. I have two nodes running on AWS: one is an m5.large VM and the other runs in a Kubernetes cluster. The VM node syncs fine, and I used its disk snapshot to create the other node in Kubernetes. The container node, when using v2.7.2, can never catch up with the latest block, running at mostly 0.00 blk/s (with a lot of GetNodeData slow warnings). Then I downgraded its db to 2.5 and ran it with v2.5.13, and it went to around 0.6 to 1 blk/s.
I got the same issue with parity 2.7.2. After resyncing with parity 2.5.3 (I started a new parity client on the same machine), everything seems to be good.
update from my side
I got the same issue. I have two nodes running on AWS: one is an m5.large VM and the other runs in a Kubernetes cluster. The VM node syncs fine, and I used its disk snapshot to create the other node in Kubernetes. The container node, when using v2.7.2, can never catch up with the latest block, running at mostly 0.00 blk/s (with a lot of GetNodeData slow warnings). Then I downgraded its db to 2.5 and ran it with v2.5.13, and it went to around 0.6 to 1 blk/s.
Update from my side: my problem was I/O; a disk restored from a snapshot needs to warm up to reach its full IOPS. After warming up, my node has synced to the latest block. Anyway, I think the 2.7.2 version is definitely slower than 2.5.x.
I have the same issue with Parity 2.7.2. It started misbehaving all of a sudden, and it syncs extremely slowly and has "slow fetch" messages as well.
I'm a bit afraid of downgrading though because I don't want to make the problem worse, I need this node to be working as soon as I can...
I'm also worried that get.parity.io seems to no longer exist; Parity was installed on this machine using bash <(curl https://get.parity.io -Lk) -r stable, so if I mess up I may not even be able to easily fix the mess afterwards. It's on Ubuntu.
What is the general advice in this case? Is the downgrade "safe"?
2.7.2 version... was synced and running fine for weeks... then all of a sudden for no apparent reason... excessive...
2020-04-30 08:25:49 UTC IO Worker #1 WARN sync 9 -> GetNodeData: item 69/384 – slow state fetch for hash 0x60ffe6e056a92b05df4a3de978857e5bdf80c8110165a260df9d092aadeec00d; took Instant { tv_sec: 911, tv_nsec: 784678089 }
2020-04-30 08:25:55 UTC IO Worker #3 WARN sync 9 -> GetNodeData: item 159/384 – slow state fetch for hash 0xa429e294c3388463f448a8ac1c45ba461fbaa0fa3f1151d67b17b9f98f7bcf76; took Instant { tv_sec: 917, tv_nsec: 98440592 }
2020-04-30 08:25:55 UTC IO Worker #3 WARN sync 9 -> GetNodeData: item 54/384 – slow state fetch for hash 0x048003563c0c3a3ae64ffd0261affa50d2c1cde1a251fd8ee025e49d370ead56; took Instant { tv_sec: 917, tv_nsec: 438773547 }
2020-04-30 08:25:57 UTC IO Worker #3 WARN sync 9 -> GetNodeData: item 5/384 – slow state fetch for hash 0xdf6a6790a8d7bf0fda2b38aaa5b48aa870e02438e32874fb1ddd40e6841ead4a; took Instant { tv_sec: 919, tv_nsec: 108621033 }
2020-04-30 08:25:57 UTC IO Worker #0 WARN sync 9 -> GetNodeData: item 11/384 – slow state fetch for hash 0x85a0e528b093e7f7dd027c0f743f6dd65d74d2d7129a7d659963cc559c541539; took Instant { tv_sec: 919, tv_nsec: 544842060 }
2020-04-30 08:25:59 UTC IO Worker #1 WARN sync 9 -> GetNodeData: item 218/384 – slow state fetch for hash 0x6796bd131c3245f4dcf2d3ffa589a0aa2399c822a054ccb9751ad4a605be1b53; took Instant { tv_sec: 920, tv_nsec: 869450369 }
Just an update, we are still having the same issue with open-ethereum 3.0.0 with the following configuration
# This config should be placed in following path:
# ~/.local/share/io.parity.ethereum/config.toml
[parity]
# Ethereum Mainnet Network
chain = "mainnet"
# Parity continuously syncs the chain
mode = "active"
# No updates will be auto-installed
auto_update = "none"
# Disables auto downloading of new releases. Not recommended.
no_download = true
# Blockchain and settings will be stored in /data.
base_path = "/data"
[rpc]
# Allows Cross-Origin Requests from domain '*'.
cors = ["*"]
# JSON-RPC will be listening for connections on IP all.
interface = "all"
# Only selected APIs will be exposed over this interface.
apis = ["web3", "eth", "net", "traces"]
# Allow connections only using specified addresses.
hosts = ["all"]
# JSON-RPC over HTTP will be accessible on port 9091.
port = 9091
# Threads for handling incoming connections for HTTP JSON-RPC server.
server_threads = 16
[websockets]
# JSON-RPC will be listening for connections on IP all.
interface = "all"
# Allows connecting from Origin 'all'.
origins = ["all"]
# Only selected APIs will be exposed over this interface.
apis = ["pubsub"]
# JSON-RPC over WebSockets will be accessible on port 9092.
port = 9092
[ipc]
# You won't be able to use IPC to interact with Parity.
disable = true
[network]
# Parity will sync by downloading latest state first. Node will be operational in couple minutes.
warp = true
# Parity will maintain additional 25 peers for snapshot sync.
snapshot_peers = 25
# Disable serving light peers.
no_serve_light = true
[footprint]
# Compute and Store tracing data. (Enables trace_* APIs).
tracing = "off"
# Prune old state data. Maintains journal overlay - fast but extra 50MB of memory used.
pruning = "fast"
# Will keep up to 20000 old state entries.
pruning_history = 20000
# If defined, will never use more than 4096MB for all caches. (Overrides other cache settings.)
cache_size = 4096
[mining]
# Maximum number of transactions per sender in the queue. By default it's 1% of the entire queue, but not less than 16.
tx_queue_per_sender = 256
# Maximum amount of memory that can be used by the transaction queue. Setting this parameter to 0 disables limiting.
tx_queue_mem_limit = 32
# Parity will keep/relay at most 8196 transactions in queue.
tx_queue_size = 8196
@sammy1991106 To be honest, I think 4096MB for all caches is far from enough to keep pruning_history=20000. Maybe 32768MB is also not enough. I give parity 64000MB and it will use up to 128000MB of memory.
@jiangxjcn I would agree if that cache limit also applied to pruning_history, which I am not sure it does. The same configuration works fine for 2.5.13, though; I only see this issue with 2.7.2 and 3.0.0.
Redacted comment since it was rather large and later understood as off topic. See https://github.com/openethereum/openethereum/issues/11494#issuecomment-636523424.
@sammy1991106 didn't you notice the opposite (it synced no problem) on the ethereum-classic mainnet?
For Ethereum Classic, probably related: https://github.com/openethereum/openethereum/issues/11104
@richardpringle No, I didn't. Both 2.7.2 and 3.0.0 work fine for ethereum-classic's mainnet and Kotti, as well as ethereum's Ropsten.
same issue here
FYI, I just upgraded my 2.6.8 to 3.0.1 and, from 1 miserable peer, it is currently syncing with 17/25 peers.
For me, 2.7.2 just completely stops syncing every few days. The server still runs and RPC calls are still answered, but it just doesn't get any new blocks, even though it shows 10+ connections. I have full debug logs of a few instances, but they don't seem to say anything interesting.
I think it got better when I enabled --no-periodic-snapshot (now only once per week), but it still happens. I wrote a script that just kills and restarts the parity node when it's out of date by more than 10 minutes; that works.
Edit: probably more relevant to #11737
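A watchdog like the one described above can be approximated as follows. This is a hedged sketch, not the commenter's actual script: the RPC endpoint and the 10-minute threshold are assumptions, and the restart command itself is left out:

```python
import json
import time
import urllib.request

RPC_URL = "http://localhost:8545"  # assumed; point at your node's HTTP RPC
MAX_LAG = 600                      # seconds; "out of date by more than 10 minutes"

def head_timestamp(rpc_url=RPC_URL) -> int:
    """Fetch the node's latest block header and return its timestamp."""
    payload = {"jsonrpc": "2.0", "id": 1,
               "method": "eth_getBlockByNumber",
               "params": ["latest", False]}
    req = urllib.request.Request(rpc_url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        block = json.loads(resp.read())["result"]
    return int(block["timestamp"], 16)  # hex-encoded Unix time

def is_stalled(head_ts: int, now: float, max_lag: int = MAX_LAG) -> bool:
    """True when the newest block is older than max_lag seconds, i.e. the
    node has likely stopped importing and should be restarted."""
    return now - head_ts > max_lag
```

A loop or cron job would call head_timestamp(), test is_stalled(ts, time.time()), and trigger the restart (e.g. via the service manager) when it returns True; pairing that with a systemd Restart=always policy covers crashes as well.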
I saw these errors on OpenEthereum 3.0.1. Any ideas how to resolve this?
@avolkov-dev this may help: https://github.com/openethereum/openethereum/issues/11494#issuecomment-639457437
Will try thanks!
@avolkov-dev yes, the script from @phiresky works well for us; in production we ensure the script is running with monit and have the systemd service file for parity set to always restart.
I have been running 2 parity nodes for the Ethereum mainnet in a Kubernetes cluster, and just upgraded them from 2.5.13 to 2.7.2. Now block syncing has become extremely slow (~0.04 blk/s) and they cannot even keep up with the chain.
My TOML configuration file for both nodes is as follows.
Each node is running in a Kubernetes pod with an SSD, 16 CPUs, and 64GB of memory. What I notice is that despite scale_verifiers being enabled, both nodes only consume 1 CPU all the time. The logs look much like the ones posted above. Can someone please point out what could have gone wrong with my settings?