github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License

"attempt-instant-ddl" should support lock_wait_timeout #1386

Open Hexcles opened 8 months ago

Hexcles commented 8 months ago

Currently gh-ost sets lock_wait_timeout only during the normal cut-over: https://github.com/search?q=repo%3Agithub%2Fgh-ost%20lock_wait_timeout&type=code

When Instant DDL is used, there doesn't seem to be a way to set the lock timeout. We can either reuse the same cut-over-lock-timeout-seconds flag or introduce a new one specifically for Instant DDL.
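To make the request concrete, here is a minimal sketch of the idea, not gh-ost's actual code: set MySQL's session-level `lock_wait_timeout` on the same connection immediately before issuing the `ALGORITHM=INSTANT` statement, so the metadata lock wait is bounded. The helper name `instantDDLStatements` and its parameters are hypothetical; whether the value would come from cut-over-lock-timeout-seconds or a new flag is exactly the open question above.

```go
package main

import (
	"fmt"
	"strings"
)

// instantDDLStatements is a hypothetical helper (not part of gh-ost).
// It returns the statements to execute, in order, on ONE dedicated
// connection: lock_wait_timeout is session-scoped, so the SET only
// affects an ALTER issued on that same connection.
func instantDDLStatements(schema, table, alterClause string, lockWaitTimeoutSeconds int) []string {
	// Quote identifiers with backticks, doubling any embedded backtick.
	quote := func(ident string) string {
		return "`" + strings.ReplaceAll(ident, "`", "``") + "`"
	}
	return []string{
		// Bound how long the ALTER may wait for the metadata lock (seconds).
		fmt.Sprintf("SET SESSION lock_wait_timeout = %d", lockWaitTimeoutSeconds),
		// Request the instant algorithm explicitly; MySQL rejects the
		// statement if the change cannot be applied instantly.
		fmt.Sprintf("ALTER TABLE %s.%s %s, ALGORITHM=INSTANT",
			quote(schema), quote(table), alterClause),
	}
}

func main() {
	// Example values are assumptions chosen for illustration only.
	for _, stmt := range instantDDLStatements("test", "users", "ADD COLUMN nickname VARCHAR(64)", 3) {
		fmt.Println(stmt)
	}
}
```

With a timeout in place, an Instant DDL attempt blocked by a long-running transaction would fail with a lock wait timeout error after the configured number of seconds instead of queueing indefinitely, and the tool could then fall back to the regular migration path.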

hakusaro commented 7 months ago

Peering into https://github.com/github/gh-ost/pull/1201, I found it perplexing that it says this in the docs:

This is not a problem for most scenarios, but it could be a problem for users that start the DDL during a period with long running transactions.

This is really critical information, in my opinion: if gh-ost is supposed to be inherently safe, Instant DDL seems to jeopardize that safety by potentially causing table outages with no controllable timeout. We primarily introduced gh-ost because of long-running transactions that were hard to pin down and a lack of safety with LHM in these scenarios. While we've mostly cleaned these up, I still think anything that could even remotely cause a table outage should have a defined upper bound on how long the table can be unavailable.