Summary:
The root cause of this issue is that, after recovery from a "no space" error, the seen_error flag in WritableFileWriter was not reset.
As far as I can tell, the seen_error flag exists to prevent repeated write retries while an error is present.
SyncWalImpl handles a similar situation: when error_recovery_in_prog is true, the flag is also reset there.
Therefore, it is acceptable to reset it in ResumeImpl as well.
Because the reset should happen only after a successful resume, and must happen before 'OnErrorRecoveryCompleted' is called, the changes are as follows.
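
To illustrate the behavior described above, here is a minimal, self-contained sketch of a sticky error flag that blocks writes until it is reset on a successful resume. This is a toy model, not RocksDB code: `ToyWriter`, `set_no_space`, and `ToyResume` are hypothetical names invented for this example, and the real WritableFileWriter and ResumeImpl are considerably more involved.

```cpp
#include <atomic>
#include <string>

// Toy stand-in for a file writer with a sticky error flag.
// Hypothetical type for illustration; not RocksDB's WritableFileWriter.
class ToyWriter {
 public:
  // Returns true on success. Once an error has been seen, every later
  // Append fails fast without attempting the write again.
  bool Append(const std::string& data) {
    if (seen_error_.load(std::memory_order_relaxed)) {
      return false;  // sticky error: no retry until the flag is reset
    }
    if (no_space_) {
      // Simulated "no space" failure: remember that an error occurred.
      seen_error_.store(true, std::memory_order_relaxed);
      return false;
    }
    buffer_ += data;
    return true;
  }

  // Test hook (assumption): toggles the simulated "no space" condition.
  void set_no_space(bool v) { no_space_ = v; }

  void reset_seen_error() {
    seen_error_.store(false, std::memory_order_relaxed);
  }
  bool seen_error() const {
    return seen_error_.load(std::memory_order_relaxed);
  }
  const std::string& contents() const { return buffer_; }

 private:
  std::atomic<bool> seen_error_{false};
  bool no_space_ = false;
  std::string buffer_;
};

// Hypothetical resume step: the flag is cleared only when recovery
// succeeded, and before any "recovery completed" notification would fire.
bool ToyResume(ToyWriter& writer, bool recovery_succeeded) {
  if (!recovery_succeeded) {
    return false;  // leave the flag set; writes keep failing fast
  }
  writer.reset_seen_error();
  // A listener callback such as OnErrorRecoveryCompleted would run here.
  return true;
}
```

In this sketch, freeing space alone is not enough: writes keep failing until the resume step clears the flag, which mirrors the symptom this change addresses.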
Test Plan:
Added a test case 'NoSpaceOnWriteWalAndRecovery' in 'db_io_failure_test.cc' to test the "no space" error and recovery when writing WAL.
Modified 'db_test_util.h' to simulate the "no space" error when appending WAL.
Fix 11643