Open. antonzabreyko opened this issue 2 years ago.
To give an update on the state of this PR, @antonzabreyko and I discussed two mitigations:
Is this still an issue?
Yeah, this is still a pretty major issue. We can close this when we migrate to the ChunkStore-free gateway, but that hasn't been integrated yet. The issue is that the manager cannot keep up with high QPS when there are many chunks.
I can issue a quick patch that slackens the state check if it's needed before the ChunkStore is removed. Just let me know.
Let's wait until the ChunkStore-free gateway code is merged (Sarah is working on this), and then @antonzabreyko should retest that code to see if the issue persists.
When running a transfer job consisting of one million files, the following error occurs:
This error appears to occur non-deterministically: in a previous run it occurred after roughly 300 KiB had been transferred.