pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.38k stars 5.85k forks source link

Make it possible to use TiDB without experimental or deprecated features #22502

Open morgo opened 3 years ago

morgo commented 3 years ago

Feature Request

Is your feature request related to a problem? Please describe:

Currently there are a lot of experimental features in TiDB. Just from a docs search, there are 49 results.

Most of these features are behind feature flags, but others such as User-Defined Variables are not. This is an issue because the suggestion in the docs that it should not be used in production is impossible to comply with: it is used when restoring a mysqldump, connecting with certain client libraries and so on.

Describe the feature you'd like:

I would like the ability to have:

For the sake of the proposal lets call these options:

By disabling deprecated features from being permitted, I can ensure that a newly deployed application will have longevity, as upgrades should not break as incompatibilities are introduced at the next major version.

Describe alternatives you've considered:

This is an alternative to the current behavior.

Teachability, Documentation, Adoption, Migration Strategy:

The default of tidb_enable_experimental_features=ON would maintain the current behavior of feature flags. ALL could be described as similar to a "beta channel".

For tidb_enable_deprecated_features, it could be recommended to set this to OFF prior to any upgrade to a major version so the application could be validated in a lower risk environment, or the default could be OFF with users being instructed that they can turn it back on and expect the feature to only remain working like this for one major version.

This adds new configuration (regrettable), but can be used as a simplification of the current configuration. It caters for the primary use case of users that don't want to run any experimental features in production (irrespective of what they are).

siddontang commented 3 years ago

I also want to know how long has each experiment feature been available? if an experiment feature exists too long and has not been stable, I think we should investigate the reason.

morgo commented 3 years ago

@siddontang for the example I gave (user variables), the QA team confirmed it is only due to test coverage. I proposed it for Sprint 2, but it was rejected.

bb7133 commented 3 years ago

Good idea Morgan, I have 2 questions for now:

morgo commented 3 years ago

Good idea Morgan, I have 2 questions for now:

  • Are the features of TiKV/PD included in your proposal?

Not currently.

  • How can we make sure that all experimental/deprecated features can be controlled by the new options?

It would be easiest if internally this was controlled by an experiments or feature flags package. This helps make sure the semantics are followed consistently.

morgo commented 3 years ago

I would like to propose a new unified sysvar called tidb_experimental_switch. The semantics will behave similar to MySQL optimizer switch:

mysql [localhost:8024] {msandbox} ((none)) > SELECT @@optimizer_switch\G
*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on,hash_join=on,subquery_to_derived=off,prefer_ordering_index=on,hypergraph_optimizer=off,derived_condition_pushdown=on
1 row in set (0.00 sec)

mysql [localhost:8024] {msandbox} ((none)) > SET optimizer_switch="index_merge=off";
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:8024] {msandbox} ((none)) > SELECT @@optimizer_switch\G
*************************** 1. row ***************************
@@optimizer_switch: index_merge=off,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_indexes=off,skip_scan=on,hash_join=on,subquery_to_derived=off,prefer_ordering_index=on,hypergraph_optimizer=off,derived_condition_pushdown=on
1 row in set (0.00 sec)

The behavior extension that I would like to add, is that wildcards are also permitted. This will permit for all experiments to be enabled or disabled. i.e.

mysql [localhost:8024] {msandbox} ((none)) > SET optimizer_switch="*=off";
ERROR 1231 (42000): Variable 'optimizer_switch' can't be set to the value of '*=off'
mysql [localhost:8024] {msandbox} ((none)) > SET optimizer_switch="%=off";
ERROR 1231 (42000): Variable 'optimizer_switch' can't be set to the value of '%=off'

(I am not sure in this context which wildcard makes more sense, but it should be possible to update all experiments at once like this).

morgo commented 3 years ago

I have a demo available:

mysql> select @@tidb_experimental_feature_switch\G
*************************** 1. row ***************************
@@tidb_experimental_feature_switch: alter_placement=ON,cascades_planner=ON,change_column_type=OFF,change_multi_schema=OFF,chunk_rpc=ON,extended_stats=ON,fast_analyze=OFF,global_temporary_table=ON,index_merge=ON,parallel_apply=ON,pipelined_window_function=ON,point_get_cache=OFF,streaming=OFF,table_list_partitioning=ON,table_partitioning=ON,vectorized_expression=ON,window_functions=ON
1 row in set (0.00 sec)

mysql> SET tidb_experimental_feature_switch='change_multi_schema=ON';
Query OK, 0 rows affected (0.00 sec)

mysql> select @@tidb_experimental_feature_switch\G
*************************** 1. row ***************************
@@tidb_experimental_feature_switch: alter_placement=ON,cascades_planner=ON,change_column_type=OFF,change_multi_schema=ON,chunk_rpc=ON,extended_stats=ON,fast_analyze=OFF,global_temporary_table=ON,index_merge=ON,parallel_apply=ON,pipelined_window_function=ON,point_get_cache=OFF,streaming=OFF,table_list_partitioning=ON,table_partitioning=ON,vectorized_expression=ON,window_functions=ON
1 row in set (0.00 sec)

mysql> SET tidb_experimental_feature_switch='%=ON';
Query OK, 0 rows affected (0.00 sec)

mysql> select @@tidb_experimental_feature_switch\G
*************************** 1. row ***************************
@@tidb_experimental_feature_switch: alter_placement=ON,cascades_planner=ON,change_column_type=ON,change_multi_schema=ON,chunk_rpc=ON,extended_stats=ON,fast_analyze=ON,global_temporary_table=ON,index_merge=ON,parallel_apply=ON,pipelined_window_function=ON,point_get_cache=ON,streaming=ON,table_list_partitioning=ON,table_partitioning=ON,vectorized_expression=ON,window_functions=ON
1 row in set (0.00 sec)

mysql> SET tidb_experimental_feature_switch='%=OFF';
Query OK, 0 rows affected (0.00 sec)

mysql> select @@tidb_experimental_feature_switch\G
*************************** 1. row ***************************
@@tidb_experimental_feature_switch: alter_placement=OFF,cascades_planner=OFF,change_column_type=OFF,change_multi_schema=OFF,chunk_rpc=OFF,extended_stats=OFF,fast_analyze=OFF,global_temporary_table=OFF,index_merge=OFF,parallel_apply=OFF,pipelined_window_function=OFF,point_get_cache=OFF,streaming=OFF,table_list_partitioning=OFF,table_partitioning=OFF,vectorized_expression=OFF,window_functions=OFF
1 row in set (0.00 sec)
dveeden commented 3 years ago

I think having something like this would be good and helps people to avoid experimental or deprecated features.

It might be useful to have a tidb_enable_deprecated_features with ON/OFF/WARN. So ON would not result in warnings etc but WARN would. Warnings could be in logging or as SQL warnings or both.

For the tidb_experimental_feature_switch: We could consider JSON for this as that is easier to filter/modify/validate from applications and SQL.