github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License
12.43k stars 1.26k forks source link

Feature: Potentially reduce production impact while drastically reducing migration time using transportable tablespaces #261

Open grypyrg opened 8 years ago

grypyrg commented 8 years ago

As gh-ost is very flexible in where and how it can migrate a new table, we can add MySQL transportable tablespaces in the gh-ost process to not have to rebuild very big tables on the whole replication tree which would reduce impact, but also improve migration time considerably.

We could do this:

  1. MySQL Replication tree: Master, ProdSlave, NonProdSlave
  2. Start a migration, read binary logs from NonProdSlave, create new table and copy all rows in NonProdSlave and process the binary log as as necessary, but try to use as much resources as possible and do not throttle :-). (You might as variant just to an ALTER TABLE statement if it's faster)
  3. When this is finished, NonProdSlave has the new table structure and all data in it, but changes will still happen on the master.
  4. NonProdSlave: Do FLUSH TABLE .. FOR EXPORT. This can take a while and will ensure change buffer and dirty pages are merged and the tablespace is clean. Keep the lock by keeping the connection open.
  5. Master & ProdSlave: Create the empty table with SQL_LOG_BIN=0
  6. Master & ProdSlave: Copy the table.{ibd,cfg} files from NonProdSlave
  7. Master & ProdSlave: ALTER TABLE table IMPORT TABLESPACE
  8. NonProdSlave: UNLOCK TABLE
  9. Continue gh-ost magic as if it was performing the migration directly from the master and process binary logs.

This however changes the architecture of gh-ost as now only MySQL client access is necessary but copying of files have to become possible somehow.

Documentation:

Limitations:

shlomi-noach commented 8 years ago

Fantastic, thank you for this suggestion!

shlomi-noach commented 8 years ago

The drawback just hit me: this would only work on the master. Importing the tablespace on the master will have no effect on the topology tree, and we would have to repeat the import throughout the topology.

Also noting down the import onto all servers would have do take place while gh-ost freezes processing of incoming binlog events: table must be imported onto all servers in tthe exact same shape before proceeding to apply binlog changes.

grypyrg commented 8 years ago

Correct. This has to be copied to the whole topology. That might make it quite a bit more complex indeed. (orchestrator integration? :-)).

gh-ost indeed will have to freeze processing, but I see that that's one of the big strengths of gh-ost and I imagine this is fairly simple to implement.

shlomi-noach commented 8 years ago

the freezing is indeed easy. and the orchestrator integration is what it would take. I generally oppose to this kind of solution because it can only handle servers that were online during the operation. What if a server was down at that time? How do we proceed? What if a server was just restoring from backup? (We have such servers at any given time)