IntersectMBO / cardano-node

The core component that is used to participate in a Cardano decentralised blockchain.
https://cardano.org
Apache License 2.0

Node on preproduction network gets stuck at slot 27059485 while syncing #5379

Closed Zsargul closed 10 months ago

Zsargul commented 1 year ago

Environment:

OS: Debian 10
Memory: 32GB RAM, 8GB Swap
Processor: 8 cores, 2.6GHz
cardano-cli version: 8.1.1
cardano-node version: 8.1.1

Problem:

I have set up a node using the preproduction testnet configuration. When I run the node, it syncs fine until it reaches slot 27059485 (block 883584, epoch 66). At that point the sync progress freezes at ~83% every time, as shown by both gLiveView.sh and cardano-cli query tip --testnet-magic 1:

{
    "block": 883584,
    "epoch": 66,
    "era": "Babbage",
    "hash": "7a049cd5d67887518e284990c2d681d222b6145e3f664dd5c88ebeced3ded96f",
    "slot": 27059485,
    "slotInEpoch": 189085,
    "slotsToEpochEnd": 242915,
    "syncProgress": "83.69"
}
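
For anyone who wants to confirm the same symptom on their own node, here is a minimal sketch in Python. It assumes you have already captured the JSON output of cardano-cli query tip several times as strings; the helper names is_stalled and epoch_length are hypothetical and not part of any Cardano tooling. It treats the node as stalled when the slot does not change across the most recent samples while syncProgress is still below 100.

```python
import json

def is_stalled(tip_samples, min_repeats=3):
    """Return True if the last `min_repeats` samples share one slot
    and the node still reports a sync progress below 100%."""
    if len(tip_samples) < min_repeats:
        return False
    recent = [json.loads(s) for s in tip_samples[-min_repeats:]]
    same_slot = len({t["slot"] for t in recent}) == 1
    behind = float(recent[-1]["syncProgress"]) < 100.0
    return same_slot and behind

def epoch_length(tip):
    """Slots elapsed in the epoch plus slots remaining in it."""
    return tip["slotInEpoch"] + tip["slotsToEpochEnd"]

# The tip reported in this issue, copied from the output above.
REPORTED_TIP = """{
    "block": 883584,
    "epoch": 66,
    "era": "Babbage",
    "hash": "7a049cd5d67887518e284990c2d681d222b6145e3f664dd5c88ebeced3ded96f",
    "slot": 27059485,
    "slotInEpoch": 189085,
    "slotsToEpochEnd": 242915,
    "syncProgress": "83.69"
}"""

if __name__ == "__main__":
    # Three identical samples stand in for three polls taken some time apart.
    print(is_stalled([REPORTED_TIP] * 3))
    print(epoch_length(json.loads(REPORTED_TIP)))
```

How far apart to take the samples is a judgment call. While the node is syncing normally the slot should move between polls, so a few samples taken a minute or more apart should be enough to tell the two cases apart.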

I have reinstalled and reconfigured the node a handful of times, carefully following the Cardano developer docs, but have hit this exact problem every time. The node syncs for about 2 hours and then freezes at the same block in epoch 66.

While the node is syncing properly, journalctl reports the expected Chain extended, new tip message. Once it reaches slot 27059485, however, the Chain extended logging stops, and journalctl instead reports a set of Info messages about peers and connections that repeat continually, as shown by this log snippet:

[2023-07-02 22:17:16.92 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 26, unidirectionalConns = 4, inboundConns = 1, outboundConns = 32})
[2023-07-02 22:17:16.92 UTC] PeerStatusChanged (ColdToWarm (Just 185.15.244.215:3001) 144.24.168.10:3003)
[2023-07-02 22:17:16.92 UTC] TracePromoteColdDone 40 30 144.24.168.10:3003
[2023-07-02 22:17:16.92 UTC] PeerSelectionCounters {coldPeers = 50, warmPeers = 28, hotPeers = 2, localRoots = []}
[2023-07-02 22:17:16.92 UTC] TrInboundGovernorCounters (InboundGovernorCounters {coldPeersRemote = 12,
[2023-07-02 22:17:16.99 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 26, unidirectionalConns = 4, inboundConns = 1, outboundConns = 33})
[2023-07-02 22:17:17.10 UTC] TrConnectionHandler (ConnectionId {localAddress = 185.15.244.215:3001, remoteAddress = 35.185.48.55:3002}) (TrHandshakeSuccess NodeToNodeV_10 (NodeToNodeVersionData {networkMagic = NetworkMagic {unNetworkMagic = 1}, diffusionMode = InitiatorAndResponderDiffusionMode, peerSharing = NoPeerSharing,
[2023-07-02 22:17:17.10 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 27, unidirectionalConns = 4, inboundConns = 1, outboundConns = 33})
[2023-07-02 22:17:17.10 UTC] PeerStatusChanged (ColdToWarm (Just 185.15.244.215:3001) 35.185.48.55:3002)
[2023-07-02 22:17:17.10 UTC] TracePromoteColdDone 40 31 35.185.48.55:3002
[2023-07-02 22:17:17.10 UTC] TrInboundGovernorCounters (InboundGovernorCounters {coldPeersRemote = 12,
[2023-07-02 22:17:17.10 UTC] PeerSelectionCounters {coldPeers = 49, warmPeers = 29, hotPeers = 2, localRoots = []}

My configuration files are straight from the official Cardano docs, and I have not altered them in any way because I'm not familiar enough with them.

I'm at quite a loss as to how to fix this so my node can sync to the preproduction testnet properly, and any help would be appreciated.

Zsargul commented 1 year ago

Update: I have configured another node using the preview testnet configuration files instead of the preproduction ones, and it fully synced after about 2 hours. I suppose this implies that something is wrong with the .json files used for my preproduction configuration.

GrzegorzDrozda commented 1 year ago

I have the same error with the same config. To add to this, I found the following error:

TrConnectionHandler (ConnectionId {localAddress = 10.64.0.68:3010, remoteAddress = 140.238.99.46:6000}) (TrConnectionHandlerError OutboundError (InvalidBlock (At (Block {blockPointSlot = SlotNo 27059560, blockPointHash = c3bd77f34ebf5c6aeccf11e19103c949dfa21fa17377dca674da3bae56d4fa5a})) a634bc4cb56255b20a35a2ca6958a17225a40296cd3fbeb484e6abe3af6cb76c (ValidationError (ExtValidationErrorLedger (HardForkLedgerErrorFromEra S (S (S (S (S (Z (WrapLedgerErr {unwrapLedgerErr = BBodyError (BlockTransitionError [ShelleyInAlonzoBbodyPredFailure (LedgersFailure (LedgerFailure (UtxowFailure (MalformedReferenceScripts (fromList [ScriptHash "9adac5980bfb37b1dade70f7bc4de76f34a7f305c9dcdbe2c00d5f5c"])))))])})))))))))) ShutdownPeer)
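
The trace above is hard to read at a glance, so here is a rough Python sketch that pulls out the rejected block's slot and hash and the script hashes named in MalformedReferenceScripts. The regular expressions are an assumption based only on the text of this one log line, not on a documented log format, and summarize_invalid_block is a hypothetical helper name. The sample string is an abbreviated excerpt of the line above.

```python
import re

# Patterns inferred from the single trace line quoted in this thread.
SLOT_RE = re.compile(r"blockPointSlot = SlotNo (\d+)")
BLOCK_HASH_RE = re.compile(r"blockPointHash = ([0-9a-f]+)")
SCRIPT_HASH_RE = re.compile(r'ScriptHash "([0-9a-f]+)"')

def summarize_invalid_block(line):
    """Return slot, block hash and script hashes from an InvalidBlock
    trace line, or None if the line is not such a trace."""
    if "InvalidBlock" not in line:
        return None
    slot = SLOT_RE.search(line)
    block_hash = BLOCK_HASH_RE.search(line)
    if slot is None or block_hash is None:
        return None
    return {
        "slot": int(slot.group(1)),
        "block_hash": block_hash.group(1),
        "script_hashes": SCRIPT_HASH_RE.findall(line),
    }

# Abbreviated excerpt of the trace above; "..." marks text left out here.
SAMPLE = (
    "(TrConnectionHandlerError OutboundError (InvalidBlock (At (Block "
    "{blockPointSlot = SlotNo 27059560, blockPointHash = "
    "c3bd77f34ebf5c6aeccf11e19103c949dfa21fa17377dca674da3bae56d4fa5a})) "
    "... (MalformedReferenceScripts (fromList [ScriptHash "
    '"9adac5980bfb37b1dade70f7bc4de76f34a7f305c9dcdbe2c00d5f5c"])) ... '
    "ShutdownPeer)"
)

if __name__ == "__main__":
    print(summarize_invalid_block(SAMPLE))
```

The rejected block is at slot 27059560, which is 75 slots after the tip the node is stuck on (27059485). That fits the node refusing the next block rather than failing to find peers.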

nexuscrypt commented 1 year ago

I have the same issue.

nexuscrypt commented 1 year ago

Fix:

1) Downgrade to 1.35.6 and delete DB
2) Let node sync
3) Upgrade to 8.1.1

Should be working after this.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 120 days.

github-actions[bot] commented 10 months ago

This issue was closed because it has been stalled for 120 days with no activity. Remove stale label or comment or this will be closed in 60 days.