facebook / mysql-5.6

Facebook's branch of the Oracle MySQL database. This includes MyRocks.
http://myrocks.io
Other
2.48k stars 712 forks source link

Make blackhole engine transactional #1276

Closed alanliang closed 1 year ago

alanliang commented 1 year ago

Summary:

The main motivation for this diff is that we want to be able to set super_read_only and read_only=1 very quickly. Normally, when there are ongoing transactions happening, setting the read only vars can time out. This is problematic for promotions in both semi-sync and raft.

The timeout is caused by long running transactions holding on to the read lock. In 2016, we introduced a sysvar legacy_global_read_lockmode in MySQL 5.6 ([Change SET GLOBAL [super]read_only to not acquire the global read lock](https://github.com/facebook/mysql-5.6/pull/1008/files)) to allow us to skip acquiring the read lock when turning on read only variables. It would skip the read lock if legacy_global_read_lock_mode=0 This would mean that turning read only would be instant and any ongoing transactions would eventually die at commit.

However, it was deemed not safe to skip the read lock because DDL statements could still be completed despite read only vars being turned on. Instead, the team decided to implement the strategy of killing of conflicting connections (Add kill_conflicting_connections option for admin user) after read only is turned on to mitigate the long running transactions problem. It is only a mitigation because after killing the transactions, we still have to wait for the rollback before the read lock is released and this can be slow and still result in a timeout.

In MySQL 8.0, transactional DDL was supported. This is great because DDL statements would be treated the same way as DML statements. So if read-only was enabled and there were an ongoing DDL statement, it will fail at commit. This means that we can safely utilize legacy_global_read_lock_mode=0 and skip the read lock when setting read only vars. However, the catch is that in MySQL 8.0, DML's in non-transactional engines like Blackhole engine, do not fail when legacy_global_read_lock_mode=0 and read-only is turned on. Specifically, when read-only is turned on, in-flight DMLs do not get blocked at commit time and are able to go through. This means that DML statements can be lost since they are seen as committed, despite read-only being on.

To address this, we can make Blackhole engine transactional.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: https://phabricator.intern.facebook.com/D43785777

alanliang commented 1 year ago
  • START TRANSACTION WITH CONSISTENT SNAPSHOT - does blackhole_hton need start_consistent_snapshot which would also call trans_register_ha?

In all places where start_consistent_snapshot() is called, there's a check to see whether the function pointer is defined first. For blackhole, which doesn't store data, it doesn't make sense to implement consistent snapshots. START TRANSACTION WITH CONSISTENT SNAPSHOT will act as a normal transaction. It just won't create a snapshot for blackhole engine.

  • Blackhole DML in START TRANSACTION WITH SHARED INNODB|ROCKSDB SNAPSHOT transactions (blackhole_hton::start_shared_snapshot)?

Similar to previous, there's a check for the function pointer before it's called. Again, it doesn't make sense to implement shared snapshots. I added tests to make sure it fails properly.

  • Blackhole DML under LOCK TABLES (ha_blackhole::start_stmt). Source code comments say that external_lock will not be called in this case.

This is a good catch. I added ha_blackhole::start_stmt and added tests. Before adding this, commit while read-only lock is set would just work. With ha_blackhole::start_stmt implemented, commit will now fail.

Interestingly this work should require removing HA_NO_TRANSACTIONS from ha_blackhole::table_flags but it's already absent there.

I noticed that too! However, when I issue a SHOW ENGINES, it correctly says that BLACKHOLE now supports transactions.

alanliang commented 1 year ago

I recently added a plugin sysvar so that we can enable/disable it. It will default to be disabled for safety.

alanliang commented 1 year ago

Already committed: https://github.com/facebook/mysql-5.6/commit/d1d23e7c7ca2deb6012cd12545993fb29ea01bfc