StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

[BugFix] Fix the issue where FE restart fails when creating a table containing too many tablets #53062

Open gengjun-git opened 5 days ago

gengjun-git commented 5 days ago

Why I'm doing:

Failures while serializing log data should be thrown instead of ignored. If the error is ignored, an empty log entry is written to BDB, and the FE then fails to start when it replays that entry.
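
As a rough illustration of the difference between ignoring and throwing a serialization failure, here is a minimal, self-contained sketch. It is not the StarRocks code: `JournalSerializationSketch`, `Writable`, `serializeIgnoringErrors`, and `serializeOrThrow` are hypothetical names chosen for this example, and the failing payload is simulated.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class JournalSerializationSketch {
    /** Hypothetical stand-in for a log payload that knows how to write itself. */
    public interface Writable {
        void write(DataOutputStream out) throws IOException;
    }

    /** Anti-pattern: the failure is swallowed, so the caller still gets a buffer back. */
    public static byte[] serializeIgnoringErrors(Writable entry) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            entry.write(out);
        } catch (IOException e) {
            // Error ignored: whatever was buffered so far (possibly nothing) is returned below.
        }
        return buffer.toByteArray();
    }

    /** Alternative: the failure propagates, so no bytes reach the code that persists the log. */
    public static byte[] serializeOrThrow(Writable entry) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            entry.write(out);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Simulated payload whose serialization always fails.
        Writable failing = out -> { throw new IOException("simulated serialization failure"); };

        System.out.println("ignored: " + serializeIgnoringErrors(failing).length + " bytes");
        try {
            serializeOrThrow(failing);
        } catch (IOException e) {
            System.out.println("thrown: " + e.getMessage());
        }
    }
}
```

In the first variant the caller has no way to tell that the returned bytes are unusable, so they can be persisted as if they were a valid entry. In the second, the write path stops before anything is stored.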

What I'm doing:

Fix

2024-11-20 14:28:03.900+08:00 ERROR (stateChangeExecutor|79) [BDBJournalCursor.deserializeData():253] fail to read journal entity key=13159, data=<DatabaseEntry offset="0" size="2" data="50 200 "/>
java.io.EOFException: null
        at java.io.DataInputStream.readInt(DataInputStream.java:397) ~[?:?]
        at com.starrocks.common.io.Text.readString(Text.java:391) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.JournalEntity.readFields(JournalEntity.java:249) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:248) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:292) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1933) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1892) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1221) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:745) ~[starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:107) ~[starrocks-fe.jar:?]
2024-11-20 14:28:03.919+08:00 WARN (stateChangeExecutor|79) [GlobalStateMgr.replayJournalInner():1954] catch exception when replaying journal, id: 13159, data: null,
com.starrocks.journal.JournalException: fail to read journal entity key=13159, data=<DatabaseEntry offset="0" size="2" data="50 200 "/>
        at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:254) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:292) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1933) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1892) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1221) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:745) ~[starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:107) ~[starrocks-fe.jar:?]
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:397) ~[?:?]
        at com.starrocks.common.io.Text.readString(Text.java:391) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.JournalEntity.readFields(JournalEntity.java:249) ~[starrocks-fe.jar:?]
        at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:248) ~[starrocks-fe.jar:?]
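
The `EOFException` at `DataInputStream.readInt` in the trace above can be reproduced in isolation: `readInt` needs four bytes, but the stored entry has `size="2"`. The sketch below is only an illustration with hypothetical names (`TruncatedEntrySketch`, `readIntHitsEof`); it assumes the log prints the entry's bytes as decimal values and does not use any StarRocks classes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class TruncatedEntrySketch {
    /** Returns true when reading a 4-byte int from the given bytes hits end of stream. */
    public static boolean readIntHitsEof(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the two bytes from the log line, data="50 200", read as decimal values.
        byte[] truncated = {(byte) 50, (byte) 200};
        System.out.println("2-byte entry hits EOF: " + readIntHitsEof(truncated));

        // A buffer with a full 4-byte int reads without error.
        byte[] complete = {0, 0, 0, 7};
        System.out.println("4-byte entry hits EOF: " + readIntHitsEof(complete));
    }
}
```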


gengjun-git commented 3 days ago

@mergifyio rebase

mergify[bot] commented 3 days ago

rebase

✅ Branch has been successfully rebased

gengjun-git commented 9 hours ago

@mergifyio rebase

mergify[bot] commented 9 hours ago

rebase

✅ Branch has been successfully rebased

sonarcloud[bot] commented 9 hours ago

Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)


github-actions[bot] commented 8 hours ago

[Java-Extensions Incremental Coverage Report]

✅ pass: 0 / 0 (0%)

github-actions[bot] commented 8 hours ago

[BE Incremental Coverage Report]

✅ pass: 0 / 0 (0%)