Open tirsen opened 5 years ago
Thinking about this some more, I'm starting to think it might be better to just have some form of general "metadata" storage that is accessible through the MySQL protocol. Then you could solve this at the application/library level instead.
Not sure about the syntax, but I was thinking of a key-value map per keyspace, with support for setting and getting data through vtgate SQL statements.
I like this. I was beginning to think that some of the workflows I'm building will require additional metadata. So, a generic key-value table-like interface should work great. I'm wondering if we could have it as a logical table like information_schema.vitess_metadata.
This sounds good to me too.
I wonder if it makes more sense as part of information_schema.vitess_metadata or instead an analogous vitess_information_schema.metadata? If we did the latter, we could expose other information there, like the shard topology, keyspaces, the current vschema contents (as an alternative to SHOW VSCHEMA), etc. Basically... all the things we added to SHOW could also be exposed through this.
One idea I had was to just hijack the SET syntax, i.e. something like:
SET @@vitess_metadata.<my key> = ''
Yeah, I admit that is a bit lame, but it would be extremely easy to implement :-)
A table interface would be one step up for sure but I worry about implementation cost. How complex would that be to build?
Is information_schema strictly standardized? Is it inadvisable to add additional tables there?
We're going with the SET syntax for now.
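For application or library code that wants to write metadata this way, a minimal sketch of a statement builder could look like the following. Only the `SET @@vitess_metadata.<key> = '...'` shape comes from this thread; the function name `build_set_metadata`, the rule for valid keys, and the quoting of values are assumptions made for illustration, not Vitess behavior.

```python
import re

# Assumption: keys are restricted to plain identifiers so they can be
# embedded in the variable name safely. The thread does not specify this.
_KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_set_metadata(key: str, value: str) -> str:
    """Build a SET statement in the form proposed in this thread.

    Hypothetical helper: the escaping below (doubling backslashes and
    single quotes) is an assumption about how values would be quoted.
    """
    if not _KEY_RE.fullmatch(key):
        raise ValueError(f"invalid metadata key: {key!r}")
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"SET @@vitess_metadata.{key} = '{escaped}'"
```

The resulting string would then be sent over an ordinary MySQL-protocol connection to vtgate, like any other statement.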
We now have ALTER VSCHEMA, which is awesome. On top of that, I propose we build a schema migration management system. A schema migration system is a way to structure your schema as a sequence of versioned alterations starting from the empty schema. This means that if you're deploying a version of your app that expects your schema to be at version N, but in production you have only applied migrations up to version N-X, then you can simply apply migrations (N-X+1, N-X+2, ..., N) to achieve the schema required for the new version of the app.
Schema migration tools are pretty much standard these days for applications that use databases with schemas:
https://en.wikipedia.org/wiki/Schema_migration
https://martinfowler.com/articles/evodb.html
Java: https://flywaydb.org/getstarted/java
Ruby on Rails: https://edgeguides.rubyonrails.org/active_record_migrations.html
PHP: https://laravel.com/docs/5.8/migrations
Django: https://docs.djangoproject.com/en/2.2/topics/migrations/
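The catch-up rule described above (from version N-X to version N) can be sketched in a few lines. This is only an illustration of the arithmetic; the function name `pending_migrations` and the choice to treat version 0 as the empty schema are assumptions, not part of the proposal.

```python
from typing import List


def pending_migrations(applied_version: int, target_version: int) -> List[int]:
    """Return the migration versions still to apply, in order.

    applied_version is the last migration applied in production (N-X);
    target_version is the version the new app release expects (N).
    Version 0 stands for the empty schema (an assumption of this sketch).
    """
    if applied_version < 0 or target_version < 0:
        raise ValueError("versions must be non-negative")
    if applied_version > target_version:
        raise ValueError("schema is ahead of the version the app expects")
    # Yields N-X+1, N-X+2, ..., N; empty when already up to date.
    return list(range(applied_version + 1, target_version + 1))
```

For instance, with production at version 3 and an app expecting version 6, the function returns versions 4, 5 and 6, and it returns an empty list when the two versions match.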
I propose we store vschema migration state in the topology server and add a few commands to vtctld to query current vschema migration state and idempotently apply vschema migrations.
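To make "idempotently apply" concrete, here is a rough sketch that uses an in-memory object as a stand-in for the migration state kept in the topology server. Everything in it is hypothetical: the class `InMemoryMigrationState`, its methods, and the rule that versions must be contiguous are assumptions for illustration, and the real vtctld commands and topology storage would look different.

```python
import threading
from typing import Callable, Dict, List


class InMemoryMigrationState:
    """Hypothetical stand-in for per-keyspace migration state in the topo server."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for a vschema lock
        self._versions: Dict[str, int] = {}

    def current_version(self, keyspace: str) -> int:
        # Version 0 means no migration has been applied yet.
        return self._versions.get(keyspace, 0)

    def apply(
        self,
        keyspace: str,
        migrations: Dict[int, str],
        run: Callable[[str], None],
    ) -> List[int]:
        """Apply every migration newer than the recorded version, in order.

        Calling this again with the same migrations does nothing, because
        already-recorded versions are skipped.
        """
        applied: List[int] = []
        with self._lock:  # serialize concurrent alterations
            current = self._versions.get(keyspace, 0)
            for version in sorted(migrations):
                if version <= current:
                    continue  # already applied
                if version != current + 1:
                    raise ValueError(f"missing migration {current + 1}")
                run(migrations[version])
                current = version
                # Record progress after each step so a failure midway
                # leaves the state pointing at the last applied version.
                self._versions[keyspace] = current
                applied.append(version)
        return applied
```

Here `run` would be whatever executes one ALTER VSCHEMA file; passing it in as a callback keeps the sketch independent of any particular client.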
We should also extend vttestserver such that it accepts ALTER VSCHEMA files as migrations in the schema directory.
This is not a complete schema migration management system but it's a good start. Application code can extend this for completion. For example:
To preempt a few questions:
Why not just build all of this in application code? First of all, it would be nice if Vitess provided a standard way to do this. Secondly, we want to store the migration state in the topology server, and that's not so easy to do from application code. We also want to lock the vschema against concurrent alterations, which again is not so easy from application code.
Why do we want to store the vschema migration state in the topology server? Storing the migration state in the same medium as the thing it alters is a good idea, since it then shares the same lifecycle for backups, deletion and so on. Where else would we store it? In one of the database shards? Which one? What if the database is restored from backup? Now the vschema migration state is out of sync with the vschema. Storing the state in the topology server solves all of this.