Open mattlord opened 6 days ago
Attention: Patch coverage is 16.66667% with 20 lines in your changes missing coverage. Please review.
Project coverage is 67.39%. Comparing base (216fd70) to head (0bc59bc).
Files with missing lines | Patch % | Lines |
---|---|---|
go/vt/vttablet/tabletserver/vstreamer/vstreamer.go | 9.09% | 20 Missing :warning: |
Description
For larger compressed transaction payloads (> ZstdInMemoryDecompressorMaxSize) we were already streaming the internal events as we decompressed the payload, but in the vstreamer we were still reading the entire contents into memory before sending them to the consumer (vplayer).
In this PR, we stream the internal contents all the way from the binlog consumer to the vstream consumer so that we never need to hold the entire contents, which can be tens or even hundreds of GiB, in memory at once. As the test/demonstration below shows, we allocate and use drastically less memory when processing the payloads: in this case approximately 14-18 times less.
Here's a manual test/demonstration on macOS (note that we end up with over 40 million rows in the customer table):
Results on the PR branch:
Results on the main branch:
Related Issue(s)
Checklist