tikv / sig-transaction

Resources for the transaction SIG
62 stars 13 forks source link

Need disable async commit and 1PC during the upgrade from TiDB < 5.0 #75

Open sticnarf opened 3 years ago

sticnarf commented 3 years ago

Async commit and 1PC are fundamental changes to the transaction protocol in TiDB. There will be new kinds of locks that need to be handled in new ways. Therefore, it is very hard for TiDB < 5.0 to handle these locks well, for example:

  1. The BatchGet API adds a new response-level error for returning the memory lock (see https://github.com/tikv/tikv/pull/9077). TiDB 5.0 will be able to read the memory lock so it can handle the case correctly. However, TiDB < 5.0 cannot see this newly added field and it requires all KV pairs to be returned in the pairs field, which is what we cannot accomplish in TiKV 5.0 with async commit.

  2. When there is one 5.0 TiDB instance while other instances are 4.0, during a rolling update, the 4.0 TiDB may read an async-commit lock prewritten from the 5.0 TiDB. However, TiDB 4.0 cannot handle these async-commit locks, it will just retry but never succeed. Then, this TiDB connection will hang for a long time and prevent a graceful shutdown during the rolling update.

cfzjywxk commented 3 years ago

/cc @coocood @youjiali1995 @MyonKeminta @nrc @lysu We could discuss here and check the related work.

sticnarf commented 3 years ago

I propose making enable-async-commit and enable-1pc global variables. These two variables are set to 1 in new 5.0 clusters but set to 0 for clusters upgraded from pre-5.0 versions.

Upgrading tools provided by us (like tidb-operator or tiup) can set these two variables to 1 after upgrading is finished.

youjiali1995 commented 3 years ago

I propose making enable-async-commit and enable-1pc global variables. These two variables are set to 1 in new 5.0 clusters but set to 0 for clusters upgraded from pre-5.0 versions.

Upgrading tools provided by us (like tidb-operator or tiup) can set these two variables to 1 after upgrading is finished.

Agree, same as my thoughts.

MyonKeminta commented 3 years ago

Upgrading tools provided by us (like tidb-operator or tiup) can set these two variables to 1 after upgrading is finished.

I think it's better if the user can choose whether to enable it after upgrading. A user who is sensitive to risks may not want to enable it after upgrading. Considering this, maybe it's better just to keep it disabled after upgrading, and the user need to enable it manually.

jackysp commented 3 years ago

Upgrading tools provided by us (like tidb-operator or tiup) can set these two variables to 1 after upgrading is finished.

I think it's better if the user can choose whether to enable it after upgrading. A user who is sensitive to risks may not want to enable it after upgrading. Considering this, maybe it's better just to keep it disabled after upgrading, and the user need to enable it manually.

Agree. I don't think users are eager enough to use the new features in the first transaction after the upgrade.