cashapp / spirit

Online Schema Change Tool for MySQL 8.0+
Apache License 2.0
104 stars 18 forks source link

What is this?

Spirit is a reimplementation of the schema change tool gh-ost.

It works very similar to gh-ost except:

The goal of Spirit is to apply schema changes much faster than gh-ost. This makes it unsuitable in the following scenarios:

If this is the case, gh-ost remains a fine choice.

Quick Links:

Optimizations

The following are some of the optimizations that make Spirit faster than gh-ost:

Dynamic Chunking

Rather than accept a fixed chunk size (such as 1000 rows), Spirit instead takes a target chunk time (such as 500ms). It then dynamically adjusts the chunk size to meet this target. This is both safer for very wide tables with a lot of indexes and faster for smaller tables.

500ms is quite "high" for traditional MySQL environments, but remember Spirit does not support read-replicas. This helps it copy chunks as efficiently as possible.

Ignore Key Above Watermark

As Spirit is copying rows, it keeps track of the highest key-value that either has been copied, or could be in the process of being copied. This is called the "high watermark". As rows are discovered from the binary log, they can be discarded if the key is above the high watermark. This is because once the copier reaches this point, it is guaranteed it will copy the latest version of the row.

For now, this optimization only applies well when your table has an auto_increment PRIMARY KEY. It is a lot more complicated with composite keys, or keys that could support collations (i.e. VARCHAR).

Change Row Map

As Spirit discovers rows that have been changed via the binary log, it stores them in a map. Or rather, it stores the key, and if the last operation was a DELETE or any other operation. This is called the "change row map". Periodically it then flushes the change row map by batching a large REPLACE INTO new_table .. SELECT FROM old_table and DELETE FROM new_table WHERE pk IN (..) statement.

In some workloads this can result in significant performance improvements, because updates from the binary log are merged and de-duplicated. i.e. if a row is updated 10 times, it will only be copied once.

Note: This optimization only applies if the entire PRIMARY KEY is memory comparable. If you use a VARCHAR primary key, it will use a slower queue-based approach.

Multi-threaded copy

Spirit will copy rows in multiple threads. This optimization really requires MySQL 8.0+ to make sense, which has much better support for multi-threaded replication.

While Spirit does not support read-replicas, it still tries to keep replication mostly up to date (with support for reading a replica every 2 seconds and observing lag). The replication monitor is not intended to be as high fidelity as gh-ost, and only used to ensure that DR functionality is not impacted.

Attempt Instant DDL

Spirit will attempt to use MySQL 8.0's INSTANT DDL assertion before applying the change itself. If the DDL change supports it, INSTANT DDL is a very fast operation and only requires a metadata change. Spirit also automatically detects operations that use the INPLACE algorithm but only modify metadata and executes those directly rater than using Spirit's copy mechanism.

Note: This feature has been contributed to gh-ost by the same authors of Spirit. It is disabled by default, and only in the master branch.

Resume from Checkpoint

Spirit periodically saves the progress of a schema change to an internal checkpoint table. If the migration is interrupted, it can be resumed with only about the last minute of progress lost. There are no flags required to enable this feature; it will apply automatically provided that Spirit is invoked with an identical ALTER statement and the required binary logs are still available.

When you consider that many migrations are best measured in days, this feature can save you a lot of lost work and improves the predictability of large-table schema migrations.

Performance

Our internal goal for Spirit is to be able to migrate a 10TiB table in under 5 days. We believe we are able to achieve this in most-cases, but it depends on:

For proof that it is possible, here is the final output from a migration on a 10TiB finch.xfers table on Aurora v3:

time="2023-04-21T07:08:24Z" level=info msg="apply complete: instant-ddl=false inplace-ddl=false total-chunks=926661 copy-rows-time=59h27m9.285730804s checksum-time=6h11m2.244079686s total-time=65h38m12.790047338s"

This table does include some secondary indexes, but the table was idle and no replication throttler was used. The configuration used threads=8 and target-chunk-time=2s, which is on the higher end of normal. We attempted to run a comparison with gh-ost (w/a 10K chunk-size), but canceled it after 10 days.

For a non-idle table, the performance delta is even greater. Consider the following microbench performed on a m1 mac with 10 cores and MySQL 8.0.31 using defaults:

Table/Scenario Gh-ost spirit spirit (checksum disabled)
finch.balances (800MB/1M rows), idle load 28.720s 11.197s 9.278s
finch.balances (800MB/1M rows), during bench 2:50m+ ~15-18s ~15-18s

This scenario is kind of a worse case for gh-ost since it prioritizes replication over row-copying and the benchmark never lets up. The checksum feature is not present in gh-ost, and adds about 10-20% to migration time. We typically do not recommend disabling it.

Unsupported Features

Requirements

Spirit works with the default configuration of MySQL 8.0, but checks that you have not changed the following settings:

Spirit requires an account with these privileges:

For replica throttling, Spirit requires:

GRANT SELECT on performance_schema.replication_applier_status_by_worker, performance_schema.replication_connection_status TO 'throttler';

(i.e. Replica throttling does not use SHOW REPLICA STATUS.)

Risks and Limitations

Writing a new data migration tool is scary, since bugs have real consequences (data loss).

We have also tried to balance making Spirit as fast as possible while still being safe to run on production systems that are running existing workloads. Sometimes this means spirit might venture into creating slow downs in application performance. If it does, please file an issue and help us make improvements.

Development

See DEVELOPMENT.md.