openmessaging / dledger

A raft-based java library for building high-available, high-durable, strong-consistent commitlog.
Apache License 2.0
809 stars 319 forks source link

Recover data file failed,code=414,name=DISK_ERROR,desc=] pos 0 != 222296456325 #285

Open jeffchanjunwei opened 1 year ago

jeffchanjunwei commented 1 year ago

Version:dledger-0.2.2

Rocketmq:4.8.0 dledger mode

Issue:Shutdown brokers(through “shutdown broker” command,on 1 master and 2 slave node). Then start the broker progress again. The role of one broker changes from slave to master. But the request is processed with error "system busy, start flow control for a while". In the broker log file, find the log bellow:

2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000215822106624 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000216895848448 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000217969590272 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000219043332096 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000220117073920 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000221190815744 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/data/00000000222264557568 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/index/00000000005872025600 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/index/00000000006039797760 OK
2023-03-23 21:14:27 INFO main - load /home/rocketmq/data/rmqstore/dledger-n1/index/00000000006207569920 OK
2023-03-23 21:14:27 INFO main - Begin to recover data from entryIndex=192316284 fileIndex=4 fileSize=7 fileName=/home/rocketmq/data/rmqstore/dledger-n1/data/00000000220117073920 
2023-03-23 21:14:28 INFO main - Trying to recover data file /home/rocketmq/data/rmqstore/dledger-n1/data/00000000221190815744
2023-03-23 21:14:29 INFO main - Trying to recover data file /home/rocketmq/data/rmqstore/dledger-n1/data/00000000222264557568
2023-03-23 21:14:29 INFO main - Recover data file to the end of /home/rocketmq/data/rmqstore/dledger-n1/data/00000000222264557568 
io.openmessaging.storage.dledger.exception.DLedgerException: [code=414,name=DISK_ERROR,desc=] pos 0 != 222296456325
    at io.openmessaging.storage.dledger.utils.PreConditions.check(PreConditions.java:40) ~[dledger-0.2.2.jar:na]
    at io.openmessaging.storage.dledger.store.file.DLedgerMmapFileStore.recover(DLedgerMmapFileStore.java:215) ~[dledger-0.2.2.jar:na]
    at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recover(DLedgerCommitLog.java:266) [rocketmq-store-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.store.dledger.DLedgerCommitLog.recoverNormally(DLedgerCommitLog.java:318) [rocketmq-store-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.store.DefaultMessageStore.recover(DefaultMessageStore.java:1419) [rocketmq-store-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.store.DefaultMessageStore.load(DefaultMessageStore.java:204) [rocketmq-store-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.broker.BrokerController.initialize(BrokerController.java:261) [rocketmq-broker-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.broker.BrokerStartup.createBrokerController(BrokerStartup.java:222) [rocketmq-broker-4.8.0.jar:4.8.0]
    at org.apache.rocketmq.broker.BrokerStartup.main(BrokerStartup.java:58) [rocketmq-broker-4.8.0.jar:4.8.0]
jeffchanjunwei commented 1 year ago

days later(set the broker permission to readonly), download the commitlog file 00000000222264557568 and check the file with the same method(PreConditions.check), it check susssefully.

jeffchanjunwei commented 1 year ago

@vongosling @RongtongJin do u have any ideas?