XRPLF / rippled

Decentralized cryptocurrency blockchain daemon implementing the XRP Ledger protocol in C++
https://xrpl.org
ISC License

Rippled 2.2.1 is crashing from 13/Aug #5099

Closed jerrybb closed 12 hours ago

jerrybb commented 3 weeks ago

Hi, our rippled is crashing constantly. The issue first occurred on 13 Aug at around 2 pm CEST.

Aug 22 14:50:59 xrp1 rippled[2136685]: 2024-Aug-22 12:50:59.148017732 UTC Application:NFO Process starting: rippled-2.2.1, Ins>
Aug 22 14:50:59 xrp1 rippled[2136685]: 2024-Aug-22 12:50:59.134365526 UTC LedgerConsensus:NFO Consensus engine started (cookie>
Aug 22 14:50:59 xrp1 rippled[2136685]: 2024-Aug-22 12:50:59.130173414 UTC JobQueue:NFO Using 14  threads
Aug 22 14:50:59 xrp1 systemd[1]: Started Ripple Daemon.
Aug 22 14:50:59 xrp1 systemd[1]: Stopped Ripple Daemon.
Aug 22 14:50:59 xrp1 systemd[1]: rippled.service: Scheduled restart job, restart counter is at 37.
Aug 22 14:50:58 xrp1 systemd[1]: rippled.service: Failed with result 'signal'.
Aug 22 14:50:58 xrp1 systemd[1]: rippled.service: Main process exited, code=killed, status=6/ABRT
Aug 22 14:50:58 xrp1 rippled[2136234]:   what():  File too large [system:27]
Aug 22 14:50:58 xrp1 rippled[2136234]: terminate called after throwing an instance of 'boost::system::system_error'
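
For readers of the log above: the `[system:27]` suffix in the `what()` line is the operating system error number carried by the `boost::system::system_error`. A minimal Python sketch for decoding such a number, assuming a Linux host (where 27 maps to `EFBIG`); the helper name `describe_errno` is made up for illustration and is not part of rippled:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an OS error number on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 27 is the number shown in "File too large [system:27]".
    print(describe_errno(27))
```

`EFBIG` is reported when a write would push a file past a size limit, which points at file size rather than free disk space.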

I also noticed warning and error messages like these:

2024-Aug-22 12:28:03.363245329 UTC Peer:WRN [003] onReadMessage from n9KAcqjRuntoTe3SEGFNZfB1AgQ7nYDKXyv4TuBgsqzXw2zGfFm4 at 78.46.40.207:57772: Connection reset by peer
2024-Aug-22 12:28:31.212785838 UTC Peer:WRN [004] onReadMessage from n9MmwgaEpdHwriw1qbU488yhiovbBRtg4GtxuGy5kDTCgY7gWQEr at 54.82.200.23:24890: stream truncated
2024-Aug-22 11:53:25.920757254 UTC InboundLedger:ERR Received bad node data: File too large [system:27]

The server state keeps switching between connected and disconnected.

Most rippled calls return an internal error:

# /opt/ripple/bin/rippled validators

Loading: "/etc/opt/ripple/rippled.cfg"
2024-Aug-22 13:03:02.052003622 UTC HTTPClient:NFO Connecting to 10.13.9.2:5005

{
   "error" : "internal",
   "error_code" : 73,
   "error_message" : "Internal error.",
   "error_what" : "couldn't parse reply from server"
}

validators.txt is up to date. Both curl https://vl.xrplf.org and curl https://vl.ripple.com fetch data successfully.

There is plenty of free disk space (>>TB).

Does anyone have an idea?

jerrybb commented 3 weeks ago

We found out that we hit the ext4 maximum file size limit of 16 TiB at a 4 KiB block size:

-rw-r--r-- 1 rippled rippled 16T Aug 22 15:28 nudb.dat

There is no option to split the NuDB database into smaller files, is there?
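
The 16 TiB figure follows from ext4 addressing a file's blocks with 32-bit logical block numbers, so the largest file is roughly 2^32 blocks times the block size. A quick sketch of that arithmetic; the function name is illustrative, and the exact limit on a given kernel may be slightly lower:

```python
def ext4_max_file_size(block_size: int) -> int:
    """Approximate ext4 per-file limit: 2**32 logical blocks of block_size bytes."""
    return (2 ** 32) * block_size


if __name__ == "__main__":
    limit = ext4_max_file_size(4096)  # 4 KiB blocks, as in the comment above
    print(limit // 2 ** 40, "TiB")
```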

WietseWind commented 12 hours ago

@jerrybb No way to split it, only to keep less history. Best use XFS or another filesystem without the file size limit.
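
To notice this before the daemon aborts, one could periodically compare the size of the data file with the per-file limit of the filesystem. A minimal sketch, assuming the roughly 16 TiB ext4 limit reported in this thread; the path, the 10% threshold, and the function names are hypothetical:

```python
import os

# Assumed per-file limit for ext4 with 4 KiB blocks (about 16 TiB).
EXT4_LIMIT_4K = (2 ** 32) * 4096


def headroom_fraction(size_bytes: int, limit_bytes: int = EXT4_LIMIT_4K) -> float:
    """Fraction of the per-file limit still unused (0.0 means the limit is reached)."""
    return max(0.0, 1.0 - size_bytes / limit_bytes)


def is_near_limit(path: str, min_headroom: float = 0.10) -> bool:
    """True if the file at `path` has less than `min_headroom` of the limit left."""
    return headroom_fraction(os.path.getsize(path)) < min_headroom


if __name__ == "__main__":
    # Hypothetical location; the thread does not say where nudb.dat lives.
    path = "/var/lib/rippled/db/nudb/nudb.dat"
    if os.path.exists(path) and is_near_limit(path):
        print("nudb.dat is within 10% of the assumed ext4 file size limit")
```

A check like this only gives early warning; the options named in the reply (keeping less history or moving to a filesystem such as XFS) are what actually resolve the problem.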