vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0
18.67k stars 2.1k forks source link

Hibernate/Spring (and probably other cllient libs and ORMs) cannot take advantage of global routing #11940

Open maxenglander opened 1 year ago

maxenglander commented 1 year ago

TL;DR

Background

Vitess users taking advantage of sharding may naturally end up using a unsharded keyspace and one more sharded keyspaces. A common pattern is starting out initially with an unsharded keyspace, and then migrate more and more data to sharded keyspaces.

In the course of making this transition, the user will need to either teach their application code/config to work across multiple keyspaces, or else take advantage of global routing.

Global routing vs. alternatives

Global routing is not without drawbacks (doesn't support schema introspection, limitations on DDL), but in many cases is ideal. With a mere change to the connection string (remove the default keyspace), users can migrate to Vitess without changing application code or config, and continue that way as they move data from unsharded to sharded keyspaces.

Here are some of the alternatives to global routing, and their drawbacks:

Use routing rules

We can set up long-lived routing rules so that the user can connect to the unsharded keyspaced, but still interact with tables in the unsharded keyspace(s). Some drawbacks:

Maintain app-level table-to-keyspace mapping

You can tell Hibernate to send queries for specific table to specific keyspaces by configuring the JPA entity mapping via XML file or Java annotation. For example setting @Table(catalog="sharded_ks") on en entity will tell Hibernate to look in sharded_ks when working with that entity. Some drawbacks:

Human error factor

In both of the alternatives above, there is a human error factor that comes in to play. When an admin is required to manage routing rules or an engineer is required to maintain special entity-to-keyspace mappings, sooner or later someone will make a mistake.

From this perspective, it's better if the engineer can just use global routing rules and be done with it.

Limitations on global routing

Global routing requires that you do not set a default connection. Unfortunately, tools like Hibernate/Spring require that you do set a default keyspace. This pretty much makes global routing unusable in Spring/Hibernate. It would be great if we could get global routing to work in Spring/Hibernate, to give users a seamless migration path from other systems to sharded Vitess.

The use @primary trick

Instead of making global routing work for Spring/Hibernate, we could use this (admittedly pretty good) trick. However, some drawbacks:

A variation of this trick is to tell Hibernate/Spring to use @primary in the connection string, e.g.:

spring.datasource.url=jdbc:mysql://localhost:33306/@primary

This is so close, but falls apart when Hibernate/Spring tries to perform any kind of DDL validation at startup (ddl-auto=validate), and as part of this validation makes queries like this:

show full tables from `@primary` like '%'

These will fail with:

Unknown database '@primary' in vschema
harshit-gangal commented 1 year ago

With https://github.com/vitessio/vitess/pull/11938 how do you solve the problem with ddl-auto=validate as this will result in partial tables only when query show full tables from <dbname> like '%' is executed.

maxenglander commented 1 year ago

Yes to solve that problem we will need more pieces in place e.g. https://github.com/vitessio/vitess/pull/11882 which I can revive if we decide to go forward with #11938.

harshit-gangal commented 1 year ago

We can support this request with a keyspace alias where 2 keyspaces (unsharded and sharded) represent a single keyspace though physically they are 2 different databases. This definition can live inside the VSchema and during planning, the Vitess planner can check the presence of a table in all the alias keyspaces of the given keyspace.