Open maxenglander opened 1 year ago
I like the proposal. We'll let @harshit-gangal and @systay chime in with any concerns and then we can make a decision on the path forward.
Separately, I don't know if table equivalence was ever intended to work if using the same table name for the reference table as its source. At least the example in the docs uses a different name uproduct
in the sharded keyspace. But if this is the case we need to document it as a known limitation.
@deepthi fair enough I'll update my comments about table equivalence to reflect. However I think table equivalence in the example in the docs still satisfies the goal of the RFC is the same: the end user wants to query one logical table for both reads and writes, and have the query get routed appropriately.
Thank you @maxenglander for writing an elaborative RFC. Let us go for Option 1, the reason I could think of:
Few things I would like to point out.
"products": {
"type": "reference",
"reference": {
"keyspace": "unsharded_ks",
"table": "products"
}
}
type reference
and then json property as reference
might create confusion. Also, we can mode like
"products": {
"type": "reference",
"refer": "unsharded_ks.products",
}
easier to read, refer
can also point to the unqualified table, which can follow the global routing logic.
The code in the RFC regarding the join merge is v3 code, regarding the modification in Gen4 planner. The changes need to be carried out here: https://github.com/vitessio/vitess/blob/22067c7a4e334a7e90850a2119135e6b1ae23279/go/vt/vtgate/planbuilder/operators/route_planning.go#L489-L507
Thanks for the guidance @deepthi and @harshit-gangal.
I'm going to start working on option 1.
@harshit-gangal
refer can also point to the unqualified table, which can follow the global routing logic.
Is it OK with you if I don't support that initially? Maybe that can be an iteration for the future?
This can be added next after the PR or I can push a commit on the same PR.
If we are debating names, I'd like to suggest source
instead of refer
😄
Properties should be nouns, not verbs.
One more thing we should also talk about in the RFC is the automation of updating the Vschema when a materialized workflow is created. We already have similar automation for MoveTables, CreateLookupVindex, etc.
@harshit-gangal
We already have similar automation for MoveTables, CreateLookupVindex, etc.
Oh that's cool, didn't know that. Will add something to the RFC about that.
@harshit-gangal
One more thing we should also talk about in the RFC is the automation of updating the Vschema when a materialized workflow is created.
Add notes in RFC about this. For option 1 I added:
Add an option to Materialize to automatically enrich VSchema with Reference table and Source.
We need to wait till we reach a low replication lag before we can route the reference table to the materialized keyspace in the VSchema. Otherwise we will be querying inconsistent or partially copied tables. In MoveTables
and Reshard
cutover is manually done by doing a TrafficSwitch
.
Materialize currently doesn't have this concept. You will have to manually check the lag using Workflow Show
and update the VSchema once we are in sync with the source table.
One option is to
vtctl
command, like MaterializeReferenceTable
. MaterializeReferenceTable -source commerce -table product <target_keyspace>.<workflow_name> Create
This will act like a helper: we internally use Materialize and setup the correct materialize spec..MaterializeReferenceTable <target_keyspace>.<workflow_name> Cutover
ORvtorc
to monitor the lag and do the cutoverThanks @rohit-nayak-ps I'll incorporate your notes in the RFC.
After feedback from @harshit-gangal and @systay in #11875 am amending the proposal slightly.
Initially RFC intended to route joins to reference tables to the optimal keyspace only when the reference table is unqualified, e.g.:
SELECT ks2.[table] JOIN [reference_table]
This is now changed to also optimally route joins to qualified reference tables:
SELECT ks1.[global_table] JOIN ks2.[reference_table]
The query will route [reference_table]
to ks1
if there is an identically named table in k1
that references ks2.[reference_table]
.
Added a task to add a comment directive so users can override this behavior for specific queries, e.g. for queries that need to prioritize data correctness over performance (think: materialization lag).
Some more adjustments based on feedback:
I won't be able to work on this anymore, un-assigning myself.
Introduction
The goal of this RFC is to make reference tables more useful, and to find a solution for some of its current limitations. This RFC is mainly motivated by #11863. However, depending on if/how we solve the problem, it may have a broader impact.
Reference tables are intended to be helpful in situations like:
commerce.products
.customer.customers
.customers
toproducts
.Reference tables are designed to solve this by letting you:
products
table into each shard of thecustomer
keyspace wherecustomers
lives.type
reference
in thecustomers
VSchema.products
whenever it is joined tocustomers
.Problem statement
There are limitations on how helpful VTGate will be with routing.
Let's say you want to have
products
table have the same name in both the unsharded and sharded keyspace. And then you want to issue keyspace-unqualified queries like this:These queries won't work. They will produce this error: https://github.com/vitessio/vitess/blob/4ff02c0f7af35a520bfc90d32e77fc13810dfdd9/go/vt/vtgate/vindexes/vschema.go#L566
Additionally, when the queries are keyspace-qualified, Vitess will make sub-optimal routing decisions:
Even though Vitess has all the information it needs to route the entire query to
customer
, it will route it to both keyspaces.Use cases
Just to show that this isn't a contrived problem, here is a use case that describes the situation of a customer of PlanetScale:
Historical Support
It used to be possible to route queries to either the source table or reference table.
I think (but am not 100% sure) that "table equivalence" was introduced to support the kind of unqualified routing this RFC is proposing. It's not clear whether table equivalence was designed to allow admins to have reference tables with identical names as the source table, but, in any case, this feature would satisfy the end-goal: provide clients a way to interact with a single, keyspace-unqualified table identifier that maps to either source or reference tables.
Table equivalence does not currently work.
Current Support
USE [sharded_ks]
or[globally_unique_sharded_table] JOIN [sharded_ks].[reference_table]
.INSERT
into the source table, unless the routing rules are qualified with@rdonly
or@replica
.New Support
Give Vitess users a way to:
JOIN
.Proposal
Options
Option 1: expand
reference
table metadatareference.keyspace
, and that it is not a reference table or sequence.ReferencedBy []*Table
andReferenceTo *Table
, and propagate this information to other internal structure as needed https://github.com/vitessio/vitess/blob/4ff02c0f7af35a520bfc90d32e77fc13810dfdd9/go/vt/vtgate/vindexes/vschema.go#L89-L100MaterializeReferenceTable Cutover
.Option 2: make "table equivalence" work again
Table equivalence doesn't work, but if it did then you could define routing rules like this (example taken from docs):
With these routing rules in place, clients can interact with a single logical table, and VTGate will route queries for that table to the appropriate keyspace/shard.
There are probably a couple different ways we could implement this, but a reasonable approach would be to follow the path of https://github.com/vitessio/vitess/pull/4833 in a way that plays nicely with the Gen4 planner.
Add an option to Materialize to automatically enrich VSchema with Reference table and to add routing equivalence rules.
Option 3: read/write routing rules
We could instruct VTGate to route unqualified queries to one keyspace or another based on the type of statement, e.g.:
Add an option to Materialize to automatically enrich VSchema with Reference table and (maybe?) add routing rules.
Recommendation
I recommend that we:
reference
table metadata.Reasoning:
Decision
I like the RAPID model for decision making. Not sure who at Vitess would have the final decision on this RFC.
Recommend: @maxenglander Agree: @aquarapid Perform: @maxenglander Input: @harshit-gangal @systay @deepthi @rohit-nayak-ps community Decide: @harshit-gangal "Let us go for Option 1"
Tasks
Propose tackling this RFC in separate tasks (PRs):