Overview of the Issue
When we run a restore with the builtin engine and one of the files fails to be restored (for whatever reason), we do not cancel the other ongoing file restore processes. This can lead to a very long wait before the restore properly fails, even though the error was encountered much earlier.
We should be able to cancel all ongoing operations and fail fast as soon as we detect the error.
Reproduction Steps
Vttablet flags
Vtctld flags
1. Run the local examples.
2. Insert a decent amount of data in each table. I have 50000 rows in product, 10000 in customer, and 100000 in corder.
3. Take a backup.
4. Corrupt one of the backup files, so that it generates a "premature end" error with zstd.
5. Do a restore.
6. Observe the logs. They are somewhat similar to what we see on Grafana: file 172 fails but the other files keep being processed, until it fails with the same error (mismatch sha).
Despite the error on the fourth line of the logs, files 172, 34, and 1 keep getting restored. We only see the error when the last opened file is done restoring (Completed restoring 34).
Binary Version
Operating System and Environment details
Log Fragments
No response