Closed kaijchen closed 22 hours ago
run buildall
clang-tidy review says "All clean, LGTM! :+1:"
TeamCity be ut coverage result: Function Coverage: 38.02% (9900/26039) Line Coverage: 29.21% (82824/283546) Region Coverage: 28.34% (42529/150085) Branch Coverage: 24.90% (21558/86590) Coverage Report: http://coverage.selectdb-in.cc/coverage/c040ae01a6cbdc13de30d953d2f666b82aa887b1_c040ae01a6cbdc13de30d953d2f666b82aa887b1/report/index.html
run buildall
TeamCity be ut coverage result: Function Coverage: 38.02% (9899/26039) Line Coverage: 29.20% (82790/283546) Region Coverage: 28.34% (42528/150085) Branch Coverage: 24.90% (21559/86590) Coverage Report: http://coverage.selectdb-in.cc/coverage/c040ae01a6cbdc13de30d953d2f666b82aa887b1_c040ae01a6cbdc13de30d953d2f666b82aa887b1/report/index.html
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
What problem does this PR solve?
Related PR: #38003
Problem Summary:
#38003 introduced a problem where the last sink node could report success even when close wait timed out, which may cause data loss.
That change was made in the hope of tolerating a minority of replica failures in this step. However, it turns out that when close wait fails, the last sink node can miss tablet reports from downstream nodes.
This PR fixes the problem by returning the close_wait error immediately. The most common close wait error is a timeout, which should not be tolerated on a per-replica basis anyway.
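To illustrate the difference in behavior, here is a minimal, self-contained sketch. It is not the actual Doris code: `Status`, `StreamResult`, `finish_tolerant`, and `finish_strict` are hypothetical names, and the majority rule in the tolerant variant is only an assumption about what "tolerating minority replica failure" looked like.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a status type; not Doris's real Status class.
struct Status {
    bool ok;
    std::string msg;
};

Status make_ok() { return Status{true, ""}; }
Status make_error(const std::string& msg) { return Status{false, msg}; }

// Hypothetical result of waiting for one downstream stream to close.
struct StreamResult {
    Status close_wait_status;            // e.g. a timeout
    std::vector<long> reported_tablets;  // tablet ids reported by downstream
};

// Tolerant variant (assumed shape of the problematic behavior):
// close wait failures are counted, and success is reported as long as
// fewer than half of the streams failed. A stream that timed out never
// delivered its tablet reports, so they are silently missing.
Status finish_tolerant(const std::vector<StreamResult>& streams) {
    std::size_t failed = 0;
    for (const StreamResult& s : streams) {
        if (!s.close_wait_status.ok) {
            ++failed;
        }
    }
    if (failed * 2 < streams.size()) {
        return make_ok();
    }
    return make_error("majority of streams failed in close wait");
}

// Strict variant (the behavior this PR describes):
// the first close wait error is returned immediately.
Status finish_strict(const std::vector<StreamResult>& streams) {
    for (const StreamResult& s : streams) {
        if (!s.close_wait_status.ok) {
            return s.close_wait_status;
        }
    }
    return make_ok();
}
```

With three streams where one times out in close wait, the tolerant variant reports success although one downstream never delivered its tablet reports, while the strict variant surfaces the timeout to the caller.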
Release note
None