Introduced an upper limit (32 seconds) on the retry wait duration in the scalardb test
Increased the max retry to 20
Updated the unit test
Checklist
The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.
[x] I have commented my code, particularly in hard-to-understand areas.
[x] I have updated the documentation to reflect the changes.
[x] Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
[x] Tests (unit, integration, etc.) have been added for the changes.
[x] My changes generate no new warnings.
[x] Any dependent changes in other PRs have been merged and published.
Description
Recently, scalardb.transfer$read_all_with_retry has often failed when the number of records to be lazily recovered exceeds the max retry (8). To mitigate the failure, this PR increases the max retry. However, simply increasing it makes the wait too long. For instance, increasing the max retry to 10 results in a 1024-second wait. This PR therefore also introduces an upper limit on the wait duration (32 seconds), since waiting much longer than that between retries is not useful.
With the current retry logic and max retry, the total wait duration until timeout is 510 seconds. This PR increases the max retry to 20 so that the total wait duration (542 seconds) is similar to the original.
(In fact, with a max retry of 19 the total wait duration is 510 seconds, the same as the original. But 19 seemed a bit odd to me, so I set it to 20. I don't have a strong opinion on it.)
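The figures above can be checked with a small sketch. It is written in Python for illustration and is not the test code itself. It assumes that the wait before retry i is 2^i seconds (inferred from the 510- and 1024-second figures, not taken from the source), and the function names are hypothetical.

```python
def wait_seconds(attempt, cap=None):
    """Wait before retry number `attempt` (1-based).

    Assumption: the wait doubles on every retry, i.e. 2**attempt seconds.
    If `cap` is given, the wait never exceeds it.
    """
    wait = 2 ** attempt
    return wait if cap is None else min(wait, cap)


def total_wait(max_retry, cap=None):
    """Sum of all waits until the retries are exhausted."""
    return sum(wait_seconds(i, cap) for i in range(1, max_retry + 1))


print(wait_seconds(10))         # 1024: a single wait without a cap
print(total_wait(8))            # 510: original max retry, no cap
print(total_wait(19, cap=32))   # 510: same total as the original
print(total_wait(20, cap=32))   # 542: the value chosen in this PR
```

With the 32-second cap, the first five waits add up to 62 seconds and every later retry adds a flat 32 seconds, so the total grows linearly rather than exponentially.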
Related issues and/or PRs
https://github.com/scalar-labs/scalar-jepsen/pull/97
Changes made
Additional notes (optional)
None