Open beingnoble03 opened 4 days ago
Hello reviewers! :wave: Please follow this checklist when reviewing this Pull Request.
release notes (needs details)
label if users need to know about this change.-
), and have a clear help text.Jobs
should be named in order to mark it as required
.required
, the maintainer team must be notified._vt
tables and RPCs need to be backward compatible.vtctl
command output order should be stable and awk
-able.Attention: Patch coverage is 87.30159%
with 8 lines
in your changes missing coverage. Please review.
Project coverage is 69.45%. Comparing base (
95f2e3e
) to head (8393dea
). Report is 17 commits behind head on main.
Files with missing lines | Patch % | Lines |
---|---|---|
go/vt/vtctl/workflow/utils.go | 85.71% | 6 Missing :warning: |
go/vt/vtctl/workflow/materializer.go | 90.47% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I am working on the failing TestVtctlMigrate
e2e test.
There is one issue already in the existing code: We get the list of tables just for one shard in getTablesInKeyspace()
However each target shard might have a different set of tables which exist. If the selected shard has the table, but other shards not, then we will end up not creating the target tables in some shards.
This extends to the new check we are adding. If a table is present in the first code then we will also check for data in the table in other shards. If the other shards don't have that table then the check for data will fail with a "table not found"
In addition, the getSchema()
call, made while trying to getTablesInKeyspace()
that we make is a heavy duty query, just to get the list of tables.
We may want to consider a different approach that fixes all the issues mentioned above:
MaterializeSettings
select 1 from <table1> limit 1
for each table on each shard. This will give us both the existence of the table (check for table not found
) and a non-empty table.READ UNCOMMITTED
For a workflow with 1000s of tables and 1000s of shards this will imply a large number of queries. We may want to add a --no-validations flag to bypass these checks in such cases. But since this is run during the Create
step and with a weak isolation level, we may be fine without the flag for now. The flag might be a fail-safe though in case of some bug in this code, as pointed out above.
@noble, I suggest you wait before coding further on this until we reach an agreement on the final approach.
This approach avoids the issue of size of query/number of joined tables in a query that we have with the union approach. So that will also have to be batched and a query with several tables being joined even with a union may have performance impacts which might reduce the effect of the order of magnitude increase in queries.
Description
This PR adds a validation check for empty tables in the target keyspace while creating MoveTables workflow.
Related Issue(s)
16817
Screenshot
Checklist
Deployment Notes