Closed gilescope closed 3 years ago
Will try 3rd time but think it will stick at the same place on the replay of the chain....
Hmm restarting it again seemed to get it going past that block. (I didn't nuke the previously downloaded chain)
Hi @gilescope, what is the status of your node now? Did it catch up to the fin length? The warning in your log and the high average ping time do not necessarily indicate a problem.
I left it running overnight and it's up to 130,000. I am guessing the transaction tests made the blocks much bigger from 120,000 or thereabouts. It seems to be getting there, though it is warning that the low priority inbound consensus queue is full. Maybe that warning frequency could be turned down a bit as it's happening a lot?
Yes, you are right. There are a lot of transactions in most blocks, many more during the last few hours than we have seen in the days before. The warning indicates that your node is receiving transactions faster than it can process them, so it is dropping some of them. I agree, the warning frequency is very high at the moment.
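To illustrate the dropping behaviour described above, here is a minimal sketch of a bounded inbound queue that rejects new messages once it is full. It is not the node's actual implementation; the class name `BoundedInboundQueue`, the `offer` method, and the capacity value are all hypothetical and chosen only for illustration.

```python
from collections import deque


class BoundedInboundQueue:
    """Hypothetical fixed-capacity queue: messages arriving while full are dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0  # how many messages were rejected because the queue was full

    def offer(self, message):
        """Enqueue the message, or drop it and return False when the queue is full."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(message)
        return True

    def take(self):
        """Remove and return the oldest message, or None when the queue is empty."""
        return self.items.popleft() if self.items else None


# Messages arrive faster than they are taken, so the third one is dropped.
queue = BoundedInboundQueue(capacity=2)
accepted = [queue.offer(tx) for tx in ["tx1", "tx2", "tx3"]]
print(accepted, queue.dropped)
```

A "queue is full" warning would correspond to the branch where `offer` returns False, which is why it fires so often when the consumer cannot keep up.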
Hmm well chain is at 160,000 but at this slow rate of getting through the transactions I am not sure that it will catch up. (8 core/4cpu running at 4 GHz). Why does it need to peer when catching up? I assumed it could just download the chain, parse it and then peer and get stuck in at the head of the chain... but after downloading and parsing the chain it seems to do some (non-trivial) work understanding each block.
It's covered 23 blocks in 8 mins. (Height of last finalized block now 130123)
Maybe it will go faster if I restart it? I am guessing the chain has moved on 16 blocks in those 8 mins... so at this rate it won't catch up for a long time :-(
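As a back-of-the-envelope check on those numbers, the sketch below estimates how long catching up would take at a steady rate. The 23 and 16 blocks per 8 minutes come from the comment above (the 16 is itself a guess); treating 160,000 as the head height and 130,123 as the node's position is an assumption made only for this example, and the function name is hypothetical.

```python
def minutes_to_catch_up(gap_blocks, processed_per_window, produced_per_window, window_minutes):
    """Estimate minutes until a syncing node reaches the chain head.

    Returns None when the node is not gaining on the head at all.
    """
    net_per_minute = (processed_per_window - produced_per_window) / window_minutes
    if net_per_minute <= 0:
        return None
    return gap_blocks / net_per_minute


# Assumed gap: head at 160,000, node at 130,123 (an interpretation, not a confirmed figure).
eta = minutes_to_catch_up(
    gap_blocks=160_000 - 130_123,
    processed_per_window=23,
    produced_per_window=16,
    window_minutes=8,
)
print(f"{eta / 60 / 24:.1f} days")
```

The estimate is very sensitive to the difference between the two rates: a net gain of only 7 blocks per 8 minutes means the gap closes at under one block per minute.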
Oh dear this seems terminal:
021-02-04T12:24:02.274886800Z: ERROR: TreeState: Database invariant violation: Could not read last finalized block
2021-02-04T12:24:02.275474800Z: ERROR: External: Database invariant violation: Could not read last finalized block
Error: ErrorMessage { msg: "Database invariant violation. See logs for details." }
I tried: .\concordium-node-retrieve-logs.exe (in powershell) but got:
Could not perform docker container inspect Os { code: 2, kind: NotFound, message: "The system cannot find the file specified." }
FAILURE Client has not been run - so can't gather logs from it!
Not sure where I go from here?
@gilescope Catching up does mean downloading the chain, but because any single node cannot trust another one, it must validate the blocks it receives. We aim to have a more streamlined catchup mechanism in the future where a node can be brought up more quickly by relying on finalization, but this is currently not the case.
Catchup speed relies a lot on the speed of your peers. Sometimes you peer with slow or unresponsive peers. Your node only catches up with one peer at a time, and there is a timeout, so if one peer is unresponsive after some time you will try another, and so on. So it can take some time to catch up. Restarting can help since you will generally acquire new peers. You can also tell your node to connect to specific peers if you have their IP addresses and the ports they are listening on.
Regarding your last message: that is indeed terminal. It means the node's database is corrupt, and the only solution is to clear the data and restart.
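The one-peer-at-a-time behaviour with a timeout can be sketched roughly as follows. This is only an illustration of the idea, not the node's real code: `fetch_blocks` and `is_valid` are hypothetical callables supplied by the caller, and the 30 second timeout is an arbitrary value.

```python
def catch_up_from_peers(peers, fetch_blocks, is_valid, timeout_seconds=30.0):
    """Try peers one at a time and return (peer, blocks) from the first useful one.

    fetch_blocks(peer, timeout_seconds) is a hypothetical callable that returns a
    list of blocks or raises TimeoutError when the peer is unresponsive.
    is_valid(block) is a hypothetical check, since a node cannot simply trust
    blocks received from another node.
    """
    for peer in peers:
        try:
            blocks = fetch_blocks(peer, timeout_seconds)
        except TimeoutError:
            continue  # unresponsive peer: move on to the next one
        if blocks and all(is_valid(block) for block in blocks):
            return peer, blocks
    return None  # no peer delivered valid blocks in this round


# Usage with stand-in functions: the first peer times out, the second responds.
def fake_fetch(peer, timeout_seconds):
    if peer == "slow-peer":
        raise TimeoutError(peer)
    return [f"block-from-{peer}"]


result = catch_up_from_peers(["slow-peer", "good-peer"], fake_fetch, lambda block: True)
print(result)
```

Because each unresponsive peer costs a full timeout before the next one is tried, a list that starts with several slow peers delays catch-up noticeably, which is why a restart that reshuffles peers can help.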
You might still be able to retrieve logs by running:
docker logs concordium-client > concordium-node-logs.txt
After that you either have to manually delete the database folder (on Windows it is the folder %LOCALAPPDATA%\concordium\database-v3), or run concordium-node-reset-data. The latter will also reset your node id, delete your baker keys if you have any, and delete any logs you may have.
Finally, during the last day and a half we have seen a large number of transactions being sent on the network. This would affect catchup speed as well since the nodes you are catching up with, and the network itself, are under more load.
We have an update available. Please follow our instructions on the status page and the Medium post.
Finally have an up to date chain. Will have a play now :-)
I seem to get this at Height of last finalized block 126087:
Couldn't process a finalization message from peer f2c595b977d5c96f due to error code InvalidResult
(On the website I could see: Average ping time Greater than 60s but there's no loss of internet connectivity so I think it not being able to get over this message on the blockchain stopped it responding to the ping.)
I saw in the discord that others also got it but it wasn't clear what the answer was.