vegaprotocol / vega

A Go implementation of the Vega Protocol, a protocol for creating and trading derivatives on a fully decentralised network.
https://vega.xyz
GNU Affero General Public License v3.0

Go LevelDB cannot be opened in the data node #7938

Closed daniel1302 closed 1 year ago

daniel1302 commented 1 year ago

Problem encountered

After the protocol upgrade, we are not able to open the Go LevelDB store in the data node. We are getting this error:


failed to create and publish segment: failed to create snapshot: failed to get data dump metadata: failed to get database version: FATAL: terminating connection due to administrator command (SQLSTATE 57P01)
failed to initialise network history:failed to create networkHistory service:failed to create network history store:failed to create index:failed to open level db file:resource temporarily unavailable

We have seen this error before, but there was a fix to close the Go LevelDB store properly, and we had not seen it again for 2 weeks.
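
The message `resource temporarily unavailable` is the text that Go on Linux reports for `EAGAIN`, which a non-blocking exclusive file lock returns when another open handle still holds the lock. LevelDB guards its directory with a lock file, so the error is consistent with an earlier handle on the store not having been released yet. The sketch below reproduces that condition using only the standard library. It assumes a Unix-like system (`syscall.Flock` is not available on Windows), and the helper names and the `LOCK` file in a temporary directory are illustrative, not taken from the Vega code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// tryLock opens path and takes a non-blocking exclusive flock on it.
// The returned file must stay open for as long as the lock is needed.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// secondLockError locks a file in a temporary directory, then tries to lock
// it again through a second handle and returns the error of that attempt.
func secondLockError() error {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "LOCK") // illustrative lock file name
	first, err := tryLock(path)
	if err != nil {
		return err
	}
	defer first.Close() // closing the handle releases the lock

	second, err := tryLock(path)
	if err == nil {
		second.Close()
	}
	return err
}

// isLockContention reports whether err means "someone else holds the lock".
func isLockContention(err error) bool {
	return errors.Is(err, syscall.EAGAIN)
}

func main() {
	err := secondLockError()
	fmt.Println("contention:", isLockContention(err), "error:", err)
}
```

The second attempt fails only while the first handle is open. Once the holder closes its handle or its process exits, the same call succeeds, which is why a store that is not closed cleanly before a restart can block the next start.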

You can see the pipeline that failed here: https://jenkins.ops.vega.xyz/job/common/job/system-tests-lnl-mainnet/291

Observed behaviour

The data node fails after the protocol upgrade.

Expected behaviour

The data node should not fail.

System response

Data-node stderr:


2023-03-22T19:07:41.941Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:07:41.941Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:07:41.941Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:07:41.941Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:07:46.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:07:51.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:07:56.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:01.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:06.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:11.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:11.895Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:08:11.895Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:11.895Z    WARN    dht go-libp2p-kad-dht@v0.20.0/dht.go:500    failed to bootstrap {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "error": "dial to self attempted"}
2023-03-22T19:08:11.915Z    WARN    p2pnode libp2p/pnet.go:56   We are in private network and have no peers.
2023-03-22T19:08:11.916Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:08:11.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:149  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 4 nodes: [{12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: [/ip4/127.0.0.1/tcp/8013]}]
2023-03-22T19:08:11.916Z    DEBUG   bootstrap   bootstrap/bootstrap.go:170  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:08:11.916Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:08:11.916Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:11.917Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:11.917Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:11.917Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:08:11.917Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:08:16.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:21.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:26.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:31.897Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:36.897Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:41.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:41.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:149  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 4 nodes: [{12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: [/ip4/127.0.0.1/tcp/8013]}]
2023-03-22T19:08:41.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:170  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:08:41.915Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:08:41.915Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:41.915Z    WARN    p2pnode libp2p/pnet.go:56   We are in private network and have no peers.
2023-03-22T19:08:41.915Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:08:41.915Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:41.915Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:08:41.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:08:41.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:08:46.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:51.897Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:08:56.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:01.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:06.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:11.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:149  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 4 nodes: [{12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: [/ip4/127.0.0.1/tcp/8013]}]
2023-03-22T19:09:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:170  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:09:11.914Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:09:11.915Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:11.915Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:11.915Z    WARN    p2pnode libp2p/pnet.go:56   We are in private network and have no peers.
2023-03-22T19:09:11.915Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:09:11.915Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:11.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:09:11.915Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:09:16.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:21.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:26.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:31.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:36.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:41.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:41.914Z    WARN    p2pnode libp2p/pnet.go:56   We are in private network and have no peers.
2023-03-22T19:09:41.914Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:09:41.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:149  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 4 nodes: [{12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: [/ip4/127.0.0.1/tcp/8013]}]
2023-03-22T19:09:41.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:170  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:09:41.914Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:09:41.914Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:41.914Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:41.914Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:09:41.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:09:41.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:09:46.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:51.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:09:56.896Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:10:01.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:10:06.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:10:11.895Z    DEBUG   basichost   basic/basic_host.go:320 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2023-03-22T19:10:11.895Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:10:11.895Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:10:11.896Z    WARN    dht go-libp2p-kad-dht@v0.20.0/dht.go:500    failed to bootstrap {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "error": "dial to self attempted"}
2023-03-22T19:10:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:149  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 4 nodes: [{12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: [/ip4/127.0.0.1/tcp/8013]}]
2023-03-22T19:10:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:170  12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrapping to 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:10:11.914Z    DEBUG   basichost   basic/basic_host.go:718 host 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP dialing 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP
2023-03-22T19:10:11.914Z    DEBUG   swarm2  swarm/swarm_dial.go:244 dialing peer    {"from": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP", "to": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:10:11.914Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:10:11.914Z    WARN    p2pnode libp2p/pnet.go:56   We are in private network and have no peers.
2023-03-22T19:10:11.914Z    WARN    p2pnode libp2p/pnet.go:57   This might be configuration mistake.
2023-03-22T19:10:11.914Z    DEBUG   dht go-libp2p-kad-dht@v0.20.0/routing.go:589    finding peer    {"peer": "12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP"}
2023-03-22T19:10:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:174  failed to bootstrap with 12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP: failed to find peers: failed to find any peer in table
2023-03-22T19:10:11.914Z    DEBUG   bootstrap   bootstrap/bootstrap.go:90   12D3KooWPzFoHc8RwhKAbFz3z6Jj32qMGU6TrnX5aEP7FTMxMjDP bootstrap error: failed to bootstrap. failed to find peers: failed to find any peer in table
2023-03-22T19:10:12.057Z    DEBUG   blockservice    go-blockservice@v0.5.0/blockservice.go:401  blockservice is shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.057Z    DEBUG   bitswap-server  server/server.go:285    bitswap task worker shutting down...
2023-03-22T19:10:12.073Z    INFO    peering peering/peering.go:219  stopping
2023-03-22T19:10:12.073Z    DEBUG   autorelay   autorelay/relay_finder.go:641   stopping relay finder
failed to create and publish segment: failed to create snapshot: failed to get data dump metadata: failed to get database version: FATAL: terminating connection due to administrator command (SQLSTATE 57P01)
failed to initialise network history:failed to create networkHistory service:failed to create network history store:failed to create index:failed to open level db file:resource temporarily unavailable
Error: maximum number of possible restarts has been reached: failed to execute binary /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega [datanode node --home /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/data-node/node13]: exit status 255
Usage:
  vegavisor run [flags]

Flags:
  -h, --help          help for run
      --home string   Path to visor home folder

Data-node stdout:


2023-03-22T19:10:12.072Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Peer service", "impl": "Peer{MConn{127.0.0.1:51996} 730ffca8617c35a3768d3729e9d40d9647620fc0 in}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping MConnection service", "impl": "MConn{127.0.0.1:51996}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Peer service", "impl": "Peer{MConn{127.0.0.1:52002} 185f59bdd6343c871dec7b1bf08013b7149e6bcd in}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping MConnection service", "impl": "MConn{127.0.0.1:52002}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Peer service", "impl": "Peer{MConn{127.0.0.1:52010} 59846497ccbc3483ba17f239ef225570ded1badf in}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping MConnection service", "impl": "MConn{127.0.0.1:52010}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Peer service", "impl": "Peer{MConn{127.0.0.1:52018} adc00aa46b3f117c15c1afb07f302157b0933fac in}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping MConnection service", "impl": "MConn{127.0.0.1:52018}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Peer service", "impl": "Peer{MConn{127.0.0.1:52026} f377f4304d7b49cd9a5bd4177e1d72a85610f318 in}"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping MConnection service", "impl": "MConn{127.0.0.1:52026}"}
2023-03-22T19:10:12.073Z    DEBUG   tendermint  p2p/switch.go:254   Switch: Stopping reactors
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Evidence service", "impl": "Evidence"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping StateSync service", "impl": "StateSync"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping BlockchainReactor service", "impl": "BlockchainReactor"}
2023-03-22T19:10:12.073Z    DEBUG   tendermint  service/service.go:185  service stop    {"msg": "Stopping BlockPool service (already stopped)", "impl": "BlockPool"}
2023-03-22T19:10:12.073Z    ERROR   tendermint  v0/reactor.go:135   Error stopping pool {"err": "already stopped"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping Consensus service", "impl": "ConsensusReactor"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping State service", "impl": "ConsensusState"}
2023-03-22T19:10:12.073Z    INFO    tendermint  service/service.go:176  service stop    {"msg": "Stopping TimeoutTicker service", "impl": "TimeoutTicker"}
2023-03-22T19:10:12.068Z    DEBUG   tendermint  conn/connection.go:324  Flush   {"conn": "MConn{127.0.0.1:52018}"}
2023-03-22T19:10:12.073Z    DEBUG   tendermint  conn/connection.go:327  MConnection flush failed    {"err": "write tcp 127.0.0.1:26313->127.0.0.1:52018: use of closed network connection"}
2023-03-22T19:10:12.071Z    DEBUG   tendermint  consensus/reactor.go:1153   Sending vote message    {"ps": "PeerState{\n  Key        59846497ccbc3483ba17f239ef225570ded1badf\n  RoundState PeerRoundState{\n    279/0/RoundStepPrevote @2023-03-22 19:10:11.780573807 +0000 UTC\n    Proposal 1:02335232F32D -> BA{1:x}\n    POL      BA{13:_____________} (round -1)\n    Prevotes   BA{13:xxxxxxxxxxxxx}\n    Precommits BA{13:xx__x_xxxx_xx}\n    LastCommit BA{13:xxxxxxxxxxxx_} (round 0)\n    Catchup    BA{13:xx__x_xxxx_xx} (round 0)\n  }\n  Stats      peerStateStats{votes: 488, blockParts: 18}\n}", "vote": "Vote{5:B2D5BB2E5167 279/00/SIGNED_MSG_TYPE_PRECOMMIT(Precommit) 98DEF4380BBB 61E6CB71AE7B @ 2023-03-22T19:10:11.707909506Z}"}
2023-03-22T19:10:12.073Z    DEBUG   tendermint  consensus/reactor.go:777    No votes to send, sleeping  {"rs.Height": 279, "prs.Height": 279, "localPV": "BA{13:xxxxxxxxxxxxx}", "peerPV": "BA{13:xxxxxxxxxxxxx}", "localPC": "BA{13:xx__xxxxxx_xx}", "peerPC": "BA{13:xx__x_xxxx_xx}"}
2023-03-22T19:10:12.087Z    INFO    datanode.networkHistory.service.store   store/store.go:223  Closing LevelDB
2023-03-22T19:10:12.087Z    INFO    datanode.api.grpc   api/server.go:447   Gracefully stopping gRPC based API
2023-03-22T19:10:12.087Z    INFO    datanode.admin.server   admin/server.go:85  Stopping Data Node Admin Server<>RPC based API
2023-03-22T19:10:12.087Z    INFO    datanode.gateway.restproxy  rest/server.go:191  Stopping REST<>GRPC based API
2023-03-22T19:10:12.087Z    INFO    datanode.broker.eventsource.socket-server   broker/socket_server.go:195 Closing socket server
2023-03-22T19:10:12.088Z    INFO    core.protocol.broker.socket-client  broker/socket_client.go:86  New broker connection event {"eventType": "Detached", "id": 213119520, "address": "tcp://127.0.0.1:5013"}
2023-03-22T19:10:12.088Z    INFO    datanode.gateway.gql    graphql/server.go:282   Stopping GraphQL based API
2023-03-22T19:10:12.088Z    INFO    datanode.broker.eventsource.socket-server   broker/socket_server.go:90  New broker connection event {"eventType": "Detached", "id": 214802800, "address": "tcp://[::]:5013"}
2023-03-22T19:10:12.089Z    ERROR   tendermint  p2p/switch.go:854   Won't start a peer - switch is not running  {"peer": "Peer{MConn{127.0.0.1:55358} 185f59bdd6343c871dec7b1bf08013b7149e6bcd in}"}
2023-03-22T19:10:12.129Z    ERROR   tendermint  p2p/switch.go:854   Won't start a peer - switch is not running  {"peer": "Peer{MConn{127.0.0.1:55372} 5d11fa6cc7a0ba6b05dc3e12df0b2ebeab122fe1 in}"}
2023-03-22T19:10:12.136Z    ERROR   tendermint  p2p/switch.go:854   Won't start a peer - switch is not running  {"peer": "Peer{MConn{127.0.0.1:55388} 59846497ccbc3483ba17f239ef225570ded1badf in}"}
2023-03-22T19:10:22.088Z    INFO    datanode.api.grpc   api/server.go:455   Force stopping gRPC based API
2023-03-22T19:10:22.203Z    DEBUG   visor   visor/binaries_runner.go:103    Killing binary  {"binaryPath": "/jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega"}
2023-03-22T19:10:22.419Z    ERROR   visor   visor/visor.go:137  Binaries executions has failed  {"error": "failed to execute binary /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega [datanode node --home /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/data-node/node13]: exit status 255"}
2023-03-22T19:10:22.420Z    INFO    visor   visor/visor.go:144  Binaries restart is scheduled   {"restartDelay": "10s"}
2023-03-22T19:10:32.420Z    INFO    visor   visor/visor.go:146  Restarting binaries {"remainingRestarts": 0}
2023-03-22T19:10:32.421Z    INFO    visor   visor/visor.go:119  Starting binaries
2023-03-22T19:10:32.421Z    DEBUG   visor   visor/visor.go:165  failed to get upgrade status from API   {"error": "failed to call protocolupgrade.UpgradeStatus method: failed to post data \"{\\\"method\\\":\\\"protocolupgrade.UpgradeStatus\\\",\\\"params\\\":[null],\\\"id\\\":6298431292859510122}\": Post \"http://unix/rpc\": dial unix /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/vega/node13/vega.sock: connect: connection refused"}
2023-03-22T19:10:32.421Z    DEBUG   visor   visor/binaries_runner.go:81 Starting binary {"binaryPath": "/jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega", "args": ["datanode", "node", "--home", "/jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/data-node/node13"]}
2023-03-22T19:10:32.625Z    INFO    datanode.cfgwatcher config/watcher.go:90    config watcher started successfully {"config": "/jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/data-node/node13/config/data-node/config.toml"}
2023-03-22T19:10:32.625Z    INFO    datanode.start.persistentPre    start/node_pre.go:59    vega is starting with pprof profile, this is not a recommended setting for production
2023-03-22T19:10:32.625Z    INFO    datanode.start.persistentPre    start/node_pre.go:69    Starting Vega Datanode  {"version": "v0.69.0", "version-hash": "eb97f6dbb595efe9af5671096a88a34caf919244"}
2023-03-22T19:10:32.625Z    INFO    datanode.start.persistentPre    start/node_pre.go:107   Initializing Network History
2023-03-22T19:10:32.904Z    ERROR   visor   visor/visor.go:137  Binaries executions has failed  {"error": "failed to execute binary /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega [datanode node --home /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/data-node/node13]: exit status 255"}
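
In the log above, the restarted data node exits within about 300 ms of `Initializing Network History`, with no restarts remaining. One possible mitigation, shown here only as a sketch and not as the project's actual fix, is to retry the open for a short time while the lock is still held. `openWithRetry`, `lockedFor`, and the `open` callback are hypothetical. The sketch also assumes the error chain preserves the underlying `syscall.EAGAIN`, which the log text alone does not confirm.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
	"time"
)

// openWithRetry calls open until it succeeds, fails with an error other than
// EAGAIN, or attempts run out. open is a hypothetical stand-in for whatever
// opens the store.
func openWithRetry(open func() error, attempts int, delay time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		err = open()
		if err == nil || !errors.Is(err, syscall.EAGAIN) {
			return err // success, or a failure that retrying will not fix
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("store still locked after %d attempts: %w", attempts, err)
}

// lockedFor simulates a store whose lock is held for the first n open calls.
func lockedFor(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return fmt.Errorf("failed to open level db file: %w", syscall.EAGAIN)
		}
		return nil
	}
}

func main() {
	err := openWithRetry(lockedFor(2), 5, 10*time.Millisecond)
	fmt.Println("opened:", err == nil)
}
```

Retrying only hides a slow release. If the previous process never closes the store, the retries run out and the original error is returned wrapped.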

daniel1302 commented 1 year ago

The same issue has been happening for the last few days:

failed to create and publish segment: failed to create snapshot: failed to get data dump metadata: failed to get database version: FATAL: terminating connection due to administrator command (SQLSTATE 57P01)
failed to initialise network history:failed to create networkHistory service:failed to create network history store:failed to create index:failed to open level db file:resource temporarily unavailable
Error: maximum number of possible restarts has been reached: failed to execute binary /jenkins/workspace/common/system-tests-lnl-mainnet/networkdata/testnet/visor/visor13/current/vega [datanode node --hom

Example log: https://jenkins.ops.vega.xyz/job/common/job/system-tests-lnl-mainnet/308/artifact/testnet/logs/vega-mainnet-nodeset-full-13-full/visor-13-with-vega-data-node.stderr-2023-04-06T03%3A17%3A34Z.log

https://jenkins.ops.vega.xyz/job/common/job/system-tests-lnl-mainnet/308/