The number of trie nodes that need healing strongly depends on your snap sync speed. Basically, you sync an old state of the network and then need to update it to the newest state; the longer the snap sync takes, the more outdated your state is and the longer healing will take. If healing is taking multiple days, you might want to consider stopping the node, deleting the state, and starting a snap sync anew. Usually a state heal is done in < 15h.
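One quick way to see whether the node is still in the snap-sync phase or already healing is to watch the progress lines in the client logs. A minimal sketch, assuming a Docker container named eth as used later in this thread:

# geth writes its logs to stderr, so redirect before filtering for the
# "State sync in progress" / "State heal in progress" lines
docker logs -f --tail 200 eth 2>&1 | grep -Ei 'state (sync|heal)'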
What do I need to do to properly delete the state without damaging the already downloaded block data?
The current start command is:
docker run -d --restart unless-stopped --name=eth -p 8545:8545 -p 8546:8546 -p 30303:30303 -v /mnt/ethdatabase:/root/.ethereum ethereum/client-go:v1.10.21 --mainnet --syncmode snap --http --http.api web3,eth,net,admin,personal,debug --http.addr 0.0.0.0 --http.port 8545 --ws --ws.addr 0.0.0.0 --ws.port 8546 --ws.origins=* --maxpeers 1000
Use geth removedb. Answer y to removing the state database and n to removing the ancient (block history) database. This will only remove the state and leave the chain history intact.
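With the Docker setup above, a minimal sketch of doing this (assuming the container name eth, the volume path and the image tag from the start command; adjust to your own setup):

# Stop the node first so the database is not in use
docker stop eth
# Run removedb against the same data directory: answer "y" to remove the
# state database and "n" to keep the ancient (block history) database
docker run --rm -it -v /mnt/ethdatabase:/root/.ethereum ethereum/client-go:v1.10.21 removedb
# Bring the node back up; it will start a fresh snap sync of the state
docker start eth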
Do I need to back up any data? Will it still be slow to sync after the deletion, starting again from 0?
You don't need to back up anything. It might still be slow to sync if your network connection is bad or your disk is too slow, but it might be faster than before, since it won't need to download the old blocks again.
Is this log normal? I see ERROR-level messages in it too.
That looks okay to me, you are state syncing. The eta=74h56m7.909s doesn't look too great though; your network seems to be struggling?
oh, yes, my network here is rather poor
This is the latest log. If eta is the estimated remaining sync time, the speed is acceptable at the moment; after all, I have already waited more than 2 months. 😂
State sync in progress synced=88.10% state=171.35GiB accounts=168,001,945@36.86GiB slots=655,186,574@130.92GiB codes=592,749@3.57GiB eta=3h24m35.358s
Sync information before the state was deleted:
{
currentBlock: 15487937,
healedBytecodeBytes: 380132516,
healedBytecodes: 51962,
healedTrienodeBytes: 111404821735,
healedTrienodes: 318328110,
healingBytecode: 0,
healingTrienodes: 3197,
highestBlock: 15488031,
startingBlock: 15470245,
syncedAccountBytes: 38722876923,
syncedAccounts: 158033944,
syncedBytecodeBytes: 2816834450,
syncedBytecodes: 476103,
syncedStorage: 569302144,
syncedStorageBytes: 122028748654
}
Sync information after deleting the state:
{
currentBlock: 15501384,
healedBytecodeBytes: 0,
healedBytecodes: 0,
healedTrienodeBytes: 3390436,
healedTrienodes: 6373,
healingBytecode: 0,
healingTrienodes: 1108,
highestBlock: 15501449,
startingBlock: 14960010,
syncedAccountBytes: 44923080753,
syncedAccounts: 190132727,
syncedBytecodeBytes: 4305582024,
syncedBytecodes: 661006,
syncedStorage: 769913336,
syncedStorageBytes: 165063448160
}
Hello, according to the data returned by eth.syncing, I see that the node has started synchronizing healedTrienodes again from 0. I don't know when the synchronization will finish, and the speed is very slow.
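For reference, the snapshots above can be pulled from a running container via the attached console. A minimal sketch, assuming the container name eth from the start command earlier in the thread:

# Print the whole sync-status object, including syncedAccounts and healedTrienodes
docker exec -it eth geth attach --exec 'eth.syncing'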
We've merged a lot of updates onto master that make the heal part faster, a lot faster. You might want to update to that.
I'm using the latest release, ethereum/client-go:v1.10.23. Isn't this the latest version?
Maybe you should use ethereum/client-go:latest until v1.11.0 is released.
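If you want to try the unreleased heal improvements, one way is to recreate the container from the latest tag. A sketch based on the start command earlier in the thread; double-check the flags against your own setup:

# The chain data lives in /mnt/ethdatabase, so removing the container does not delete it
docker stop eth && docker rm eth
docker pull ethereum/client-go:latest
docker run -d --restart unless-stopped --name=eth -p 8545:8545 -p 8546:8546 -p 30303:30303 \
  -v /mnt/ethdatabase:/root/.ethereum ethereum/client-go:latest \
  --mainnet --syncmode snap --http --http.api web3,eth,net,admin,personal,debug \
  --http.addr 0.0.0.0 --http.port 8545 --ws --ws.addr 0.0.0.0 --ws.port 8546 \
  --ws.origins=* --maxpeers 1000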
I used the version: ethereum/client-go@sha256:3e0a278e6a27fb80ac1e91c45f5ee9dffa32d4b3d5fddf454ec40a1342a9148d
At present, synchronization is still slow and I cannot estimate when it will complete. The pending value grows large and then shrinks again, and I'm not sure whether that is normal. The logs are as follows.
If your healing is taking forever, this is how I solved it after weeks of trying: run geth removedb and answer y, then n, as @MariusVanDerWijden recommended.
I'm not sure if it's because of the hard drive read/write speed, so here are the disk benchmark results:
cat@cat:/mnt/ethdatabase/test$ sudo fio -filename=/mnt/ethdatabase/test/test_randread -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest
mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
...
fio-3.1
Starting 10 threads
mytest: Laying out IO file (1 file / 2048MiB)
Jobs: 10 (f=10): [r(10)][34.2%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 02m:01s]
mytest: (groupid=0, jobs=10): err= 0: pid=31067: Wed Sep 14 11:05:29 2022
read: IOPS=7049, BW=110MiB/s (116MB/s)(6933MiB/62938msec)
clat (usec): min=67, max=13098k, avg=1413.32, stdev=85185.32
lat (usec): min=67, max=13098k, avg=1413.79, stdev=85185.32
clat percentiles (usec):
| 1.00th=[ 147], 5.00th=[ 482], 10.00th=[ 537], 20.00th=[ 627],
| 30.00th=[ 693], 40.00th=[ 758], 50.00th=[ 824], 60.00th=[ 881],
| 70.00th=[ 947], 80.00th=[ 1012], 90.00th=[ 1106], 95.00th=[ 1188],
| 99.00th=[ 1401], 99.50th=[ 2704], 99.90th=[ 4047], 99.95th=[ 4359],
| 99.99th=[34866]
bw ( KiB/s): min= 1504, max=20512, per=16.35%, avg=18439.90, stdev=2688.99, samples=770
iops : min= 94, max= 1282, avg=1152.49, stdev=168.06, samples=770
lat (usec) : 100=0.10%, 250=2.01%, 500=4.17%, 750=31.99%, 1000=39.70%
lat (msec) : 2=21.44%, 4=0.49%, 10=0.09%, 20=0.01%, 50=0.01%
lat (msec) : 100=0.01%, 500=0.01%, >=2000=0.01%
cpu : usr=0.76%, sys=3.29%, ctx=443842, majf=0, minf=1046
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=443699,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=110MiB/s (116MB/s), 110MiB/s-110MiB/s (116MB/s-116MB/s), io=6933MiB (7270MB), run=62938-62938msec
Disk stats (read/write):
dm-0: ios=443699/9, merge=0/0, ticks=478060/66820, in_queue=678800, util=100.00%, aggrios=221849/3, aggrmerge=0/2, aggrticks=304204/14322, aggrin_queue=319260, aggrutil=100.00%
sdg: ios=0/2, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
sdf: ios=443699/5, merge=0/4, ticks=608408/28644, in_queue=638520, util=100.00%
cat@cat:/mnt/ethdatabase/test$ sudo fio -filename=/mnt/ethdatabase/test/test_randrea2 -direct=1 -iodepth 1 -thread -rw=randwrite -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest
mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
...
fio-3.1
Starting 10 threads
mytest: Laying out IO file (1 file / 2048MiB)
Jobs: 10 (f=10): [w(10)][100.0%][r=0KiB/s,w=42.1MiB/s][r=0,w=2694 IOPS][eta 00m:00s]
mytest: (groupid=0, jobs=10): err= 0: pid=4587: Wed Sep 14 11:17:08 2022
write: IOPS=2124, BW=33.2MiB/s (34.8MB/s)(1992MiB/60003msec)
clat (usec): min=320, max=321827, avg=4700.51, stdev=6167.38
lat (usec): min=321, max=321828, avg=4701.78, stdev=6167.38
clat percentiles (usec):
| 1.00th=[ 1057], 5.00th=[ 1254], 10.00th=[ 1418], 20.00th=[ 2008],
| 30.00th=[ 2540], 40.00th=[ 3032], 50.00th=[ 3425], 60.00th=[ 3851],
| 70.00th=[ 4359], 80.00th=[ 4883], 90.00th=[ 6718], 95.00th=[ 15795],
| 99.00th=[ 28705], 99.50th=[ 32113], 99.90th=[ 49546], 99.95th=[ 98042],
| 99.99th=[173016]
bw ( KiB/s): min= 160, max= 4800, per=10.00%, avg=3400.08, stdev=1375.21, samples=1200
iops : min= 10, max= 300, avg=212.45, stdev=85.92, samples=1200
lat (usec) : 500=0.12%, 750=0.54%, 1000=0.12%
lat (msec) : 2=19.06%, 4=42.89%, 10=29.56%, 20=4.60%, 50=3.01%
lat (msec) : 100=0.06%, 250=0.03%, 500=0.01%
cpu : usr=0.20%, sys=1.82%, ctx=128874, majf=0, minf=144
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,127479,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=33.2MiB/s (34.8MB/s), 33.2MiB/s-33.2MiB/s (34.8MB/s-34.8MB/s), io=1992MiB (2089MB), run=60003-60003msec
Disk stats (read/write):
dm-0: ios=0/127341, merge=0/0, ticks=0/587604, in_queue=613852, util=100.00%, aggrios=0/63915, aggrmerge=0/0, aggrticks=0/294464, aggrin_queue=294738, aggrutil=99.92%
sdg: ios=0/232, merge=0/0, ticks=0/92, in_queue=92, util=0.03%
sdf: ios=0/127598, merge=0/1, ticks=0/588836, in_queue=589384, util=99.92%
cat@cat:/mnt/ethdatabase/test$ sudo fio -filename=/mnt/ethdatabase/test/test_randrea3 -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=60 -group_reporting -name=mytest -ioscheduler=noop
mytest: (g=0): rw=randrw, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
...
fio-3.1
Starting 10 threads
mytest: Laying out IO file (1 file / 2048MiB)
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
fio: io scheduler is not tunable
Jobs: 10 (f=10): [m(10)][100.0%][r=23.7MiB/s,w=9689KiB/s][r=1519,w=605 IOPS][eta 00m:00s]
mytest: (groupid=0, jobs=10): err= 0: pid=7102: Wed Sep 14 11:22:08 2022
read: IOPS=1394, BW=21.8MiB/s (22.9MB/s)(1308MiB/60003msec)
clat (usec): min=83, max=296123, avg=5296.99, stdev=4347.98
lat (usec): min=83, max=296124, avg=5297.50, stdev=4347.98
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 4],
| 30.00th=[ 5], 40.00th=[ 5], 50.00th=[ 5], 60.00th=[ 5],
| 70.00th=[ 6], 80.00th=[ 7], 90.00th=[ 9], 95.00th=[ 10],
| 99.00th=[ 16], 99.50th=[ 18], 99.90th=[ 27], 99.95th=[ 68],
| 99.99th=[ 228]
bw ( KiB/s): min= 64, max= 3142, per=10.00%, avg=2232.38, stdev=434.59, samples=1198
iops : min= 4, max= 196, avg=139.47, stdev=27.15, samples=1198
write: IOPS=602, BW=9644KiB/s (9876kB/s)(565MiB/60003msec)
clat (usec): min=232, max=228994, avg=4312.24, stdev=3227.87
lat (usec): min=233, max=228996, avg=4313.42, stdev=3227.87
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 4], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 4], 60.00th=[ 5],
| 70.00th=[ 5], 80.00th=[ 5], 90.00th=[ 6], 95.00th=[ 7],
| 99.00th=[ 13], 99.50th=[ 16], 99.90th=[ 24], 99.95th=[ 68],
| 99.99th=[ 140]
bw ( KiB/s): min= 32, max= 1603, per=10.00%, avg=964.64, stdev=227.13, samples=1198
iops : min= 2, max= 100, avg=60.27, stdev=14.19, samples=1198
lat (usec) : 100=0.02%, 250=0.28%, 500=0.04%, 750=0.02%, 1000=0.03%
lat (msec) : 2=0.46%, 4=33.04%, 10=62.26%, 20=3.56%, 50=0.22%
lat (msec) : 100=0.04%, 250=0.03%, 500=0.01%
cpu : usr=0.20%, sys=0.80%, ctx=120904, majf=0, minf=1
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=83699,36167,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=21.8MiB/s (22.9MB/s), 21.8MiB/s-21.8MiB/s (22.9MB/s-22.9MB/s), io=1308MiB (1371MB), run=60003-60003msec
WRITE: bw=9644KiB/s (9876kB/s), 9644KiB/s-9644KiB/s (9876kB/s-9876kB/s), io=565MiB (593MB), run=60003-60003msec
Disk stats (read/write):
dm-0: ios=83659/36158, merge=0/0, ticks=440292/153992, in_queue=597572, util=100.00%, aggrios=41849/18087, aggrmerge=0/0, aggrticks=220124/77008, aggrin_queue=297178, aggrutil=99.89%
sdg: ios=0/4, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
sdf: ios=83699/36171, merge=0/1, ticks=440248/154016, in_queue=594356, util=99.89%
These are three sets of disk performance test results (random read, random write, and 70/30 mixed random read/write).
Your IOPS is quite low. You need to increase both the IOPS and the bandwidth; it is critical for healing. Upgrading the disk alone will not solve the problem unless you also run geth removedb (y, then n) first.
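When swapping the array, it can also help to confirm which physical device actually backs the data directory before re-running the fio tests. A sketch using standard Linux tools; device names will differ per machine:

# Which filesystem and device the geth data directory lives on
df -h /mnt/ethdatabase
# ROTA=0 indicates a non-rotational device (SSD/NVMe)
lsblk -o NAME,ROTA,SIZE,MODEL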
OK. The disks were previously in a software RAID array, which kept the read/write speed low. After switching to a hardware RAID array, the read/write speed is fully sufficient and the node has now finished synchronizing. Thank you for your replies!
How can I check the latest healedTrienodes count for mainnet?
I have been syncing for a few months now and the data synchronization speed has been very slow. I don't know where the problem is. healedTrienodes rises by about 50 million per day, and I don't know when the sync will finish.
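One way to gauge progress is to sample the healedTrienodes counter twice and extrapolate a daily rate. A minimal sketch, assuming a Docker container named eth as in the setup above and that eth.syncing is currently returning the extended sync object shown earlier:

# Read healedTrienodes now, wait an hour, read it again, and extrapolate per day.
# Note: this only works while the node is still syncing (eth.syncing is not false).
A=$(docker exec eth geth attach --exec 'eth.syncing.healedTrienodes')
sleep 3600
B=$(docker exec eth geth attach --exec 'eth.syncing.healedTrienodes')
echo "healed in the last hour: $((B - A)) trie nodes (~$(( (B - A) * 24 )) per day)"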