aylei commented 5 years ago

Bug Report

kubernetes: 1.12.6 tidb-operator: latest

What did you do?

Change the value of pd.maxReplicas from 3 to 5 in values.yaml
Run helm upgrade
Wait for the rolling update to complete
Get the PD config with curl <host>:2379/pd/api/v1/config
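
For anyone who wants to script the last step, here is a minimal sketch of the same check in Python. It assumes PD is reachable over plain HTTP on port 2379 and that the response is JSON with a replication section containing max-replicas, as the field name in this report suggests. The function names are hypothetical, not part of any PD or tidb-operator tooling.

```python
import json
from urllib.request import urlopen


def fetch_pd_config(host: str, port: int = 2379, timeout: float = 5.0) -> dict:
    """GET the running PD configuration (assumes plain HTTP, no TLS)."""
    url = f"http://{host}:{port}/pd/api/v1/config"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def max_replicas(config: dict) -> int:
    """Extract replication.max-replicas from a PD config payload."""
    return int(config["replication"]["max-replicas"])


def check_max_replicas(config: dict, expected: int) -> bool:
    """Return True when the running value matches the value set in values.yaml."""
    return max_replicas(config) == expected
```

Calling check_max_replicas(fetch_pd_config("<host>"), 5) after the upgrade would return False in the situation described here, because the running value is still 3.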

What did you expect to see? The replication.max-replicas is updated to 5.

What did you see instead? The replication.max-replicas is still 3.

According to @nolouch, PD does not pick up changes from the config file once the configuration has been persisted. We may have to either:

document a warning for users that the PD configuration cannot be updated using helm (I prefer this way)
or keep the configuration in sync via the PD RESTful API in the tidb-operator control loop (this cannot be implemented elegantly while the configuration is not managed in the TidbCluster CRD)

After investigating the code at https://github.com/pingcap/pd/blob/master/server/leader.go#L398-L411 , I found that the schedule and replication configurations are persisted in etcd and cannot be updated through the config file.
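
To make the behavior concrete, here is a toy model of the precedence described above. It is not PD's actual code: the function name and the "log" section in the usage note are made up for illustration, and the only thing taken from this thread is that the schedule and replication sections come from the persisted copy once one exists.

```python
import copy
from typing import Optional

# Sections that, per this thread, PD keeps in etcd once persisted.
PERSISTED_SECTIONS = ("schedule", "replication")


def effective_config(file_config: dict, persisted: Optional[dict]) -> dict:
    """Model which values win when both a config file and a persisted copy exist."""
    result = copy.deepcopy(file_config)
    if persisted is None:
        # First start: nothing persisted yet, so the file is used as-is.
        return result
    for section in PERSISTED_SECTIONS:
        if section in persisted:
            # The persisted section replaces whatever the file says.
            result[section] = copy.deepcopy(persisted[section])
    return result
```

With a file that says {"replication": {"max-replicas": 5}} and a persisted copy that says {"replication": {"max-replicas": 3}}, this model yields 3, which matches what the report observes after the rolling update.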

zyguan commented 5 years ago

I thought this was a known problem. helm upgrade only updates the config files for some value changes, and there is currently no reload (or restart) action, so all components (tidb, tikv) have related problems.

We also need to figure out which values (not only the pd config) won't take effect after an upgrade. For pd, the RESTful API (or directly using the code of pd-ctl) might be a solution. For tikv, I'm not sure whether there is any simple way to hot reload the config without restarting the process.
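
A rough sketch of what the RESTful API route could look like for max-replicas. It assumes that PD accepts a POST to /pd/api/v1/config with a JSON body such as {"max-replicas": 5}; that is my understanding of what pd-ctl sends for config set, but treat it as an assumption and verify it against the PD version in use. The function name is hypothetical, and the request is only built here, not sent.

```python
import json
from typing import Optional
from urllib.request import Request


def build_sync_request(pd_url: str, desired: int, actual: int) -> Optional[Request]:
    """Build the update request, or return None when PD already has the desired value."""
    if desired == actual:
        return None
    # Assumed payload shape: a flat JSON object keyed by the config item name.
    body = json.dumps({"max-replicas": desired}).encode("utf-8")
    return Request(
        f"{pd_url}/pd/api/v1/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A control loop would read the running value, call build_sync_request with the desired value from the chart, and pass the result to urllib.request.urlopen only when it is not None.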

aylei commented 5 years ago

> I thought this was a known problem. helm upgrade only updates the config files for some value changes, and there is currently no reload (or restart) action, so all components (tidb, tikv) have related problems.
>
> We also need to figure out which values (not only the pd config) won't take effect after an upgrade. For pd, the RESTful API (or directly using the code of pd-ctl) might be a solution. For tikv, I'm not sure whether there is any simple way to hot reload the config without restarting the process.

#479 introduced rolling updates of PD/TiKV/TiDB nodes on configuration update, so a restart is the intended behavior.

The problem is that the scheduler and replication configurations are persisted in etcd and won't be updated by a rolling update.

zyguan commented 5 years ago

Ah, I see, I prefer documenting it too.

pingcap / tidb-operator

PD configuration updates do not work #487
