github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License
12.43k stars 1.26k forks source link

FR Speed up the binary log application #454

Open aadant opened 7 years ago

aadant commented 7 years ago

I noticed that gh-ost runtime is sensitive to its physical location. For example, if you use --test-on-replica, the gh-ost executable needs to be located on the slave for best performance. On a WAN, it can make a 1 to 100 difference. It can be the reason gh-ost will never catch up and the migration will last forever.

The reason is simple :

when looking at the general log, there are a lot of repeated statements like this on the same table :


259010 Prepare  replace /* gh-ost `ghost`.`_t_gho` */ into
                                `_ghost`.`_t_gho`
                                values
                                        (?,, ?, ?)

259010 Execute  replace /* gh-ost `ghost`.`t_gho` */ into ...
259010 Close stmt

A single event created 3 network round trips. This can obviously be optimized.

If 2 consecutive events are hitting the same table and are of the same kind (insert, update), then they can be added at the end of the replace, up to a certain number in mysqldump fashion. The most efficient would be not to use prepared statement (that could be an option) and just generate a large statement ( 1 network round trip) or execute several statements per prepare if the next event is compatible ...

Using bulk statements or repeated execute would tremendously improve performance if the table is being written to when doing the schema change.

aadant commented 7 years ago

@shlomi-noach : I think this should address https://www.percona.com/blog/2017/07/12/gh-ost-benchmark-against-pt-online-schema-change-performance/

shlomi-noach commented 7 years ago

@aadant thank you for this suggestion!

A few thoughts:

With regard https://www.percona.com/blog/2017/07/12/gh-ost-benchmark-against-pt-online-schema-change-performance/ it's worth noting that the test was executed all-on-the-same-box ; single master node with gh-ost running from same box. So network is unlikely to be a big issue. I'd still be very curious to see the impact of non-prepared statements on this benchmark; but first a few days of testing are in order.