pingcap / tidb-operator

TiDB operator creates and manages TiDB clusters running in Kubernetes.
https://docs.pingcap.com/tidb-in-kubernetes/
Apache License 2.0
1.22k stars 493 forks source link

Flexible user password initialization #274

Closed gregwebs closed 5 years ago

gregwebs commented 5 years ago

Problem

Our current initializer job that sets the user password suffers from a problem:

Both the initializer job and other passwords (backup) have passwords passed through helm.

Solving Escaping

The initializer job is structured the way it is to avoid escaping issues. However, we have determined two alternatives:

Avoiding helm

All passwords should be created as secrets by the user. The manifests just reference the secret names.

gregwebs commented 5 years ago

Also, gcp.credentialsData in the backup chart should be created as a secret.

tennix commented 5 years ago

With py-mysqldb installed on alpine linux image and the initial root password stored in ROOT_PASSWORD environment, the root password can be set with the following command:

python -c "import os, MySQLdb; MySQLdb.connect(host='127.0.0.1', port=4000, user='root').cursor().execute(\"set password for 'root'@'%%' = %s; flush privileges;\", (os.environ.get('ROOT_PASSWORD'),))"

There is already a Python mysqlclient image on DockerHub: https://github.com/tnir/mysqlclient.

As for your suggestion, I prefer to make the secrets optional: If users provide secret names for the passwords, then helm would not create these secrets. But users can also set passwords and credentials in the values.yaml and let helm to create these secrets.

gregwebs commented 5 years ago

I prefer to make the secrets optional: If users provide secret names for the passwords, then helm would not create these secrets. But users can also set passwords and credentials in the values.yaml and let helm to create these secrets.

There is a down-side to maintaining two ways of doing things, particularly where one is insecure by default. Supporting insecure by default techniques means we must spend time documenting how to do things in a secure way.

The insecure by default comes in two forms

tennix commented 5 years ago

Either way we have to document how to do things in a secure way. And the secure way is complicated for new users. Using insecure way we can allow new users quickly get TiDB up and running. If advanced users want the secure way, they can find the method in the document.

gregwebs commented 5 years ago

I agree the insecure way is easy for a demo that you are going to destroy, but it is actually more complicated once you realize you have to save your password somewhere.

The secure way just requires running kubectl create secret generic password --from-literal=key=value. Its quite easy even if more verbose than adding -set tidb.password=value to your helm install.

tennix commented 5 years ago

Yes, creating it may be easy, but the secret is outside of the helm control. Users may forget about that. Besides k8s secrets are not always encrypted. There is lots of work to do to keep the credential data real safe. So to keep it really secure, we must add more document about this for example how to enable k8s encryption at rest.

gregwebs commented 5 years ago

Yes, creating it may be easy, but the secret is outside of the helm control. Users may forget about that.

Good point. Note though that if the tidb cluster is installed into a new namespace, then deleting the namespace will delete the secret.

Besides k8s secrets are not always encrypted. There is lots of work to do to keep the credential data real safe. So to keep it really secure, we must add more document about this for example how to enable k8s encryption at rest.

I think we can just rely on the existing Kubernetes secret documentation. Whereas if we put secret values through helm they must understand both K8s secrets and all of the issues of having secrets in helm.

gregwebs commented 5 years ago

For managing the secret lifecycle, I think we can still delete the secret automatically (perhaps adding a helm delete hook somewhere).

gregwebs commented 5 years ago

Related to refactoring secrets, we currently generate a random password if none is given. However, that workflow may be broken: https://github.com/helm/helm/issues/3053

gregwebs commented 5 years ago

I have an init job now that looks like this:

apiVersion: batch/v1
kind: Job
metadata:
  name: {{.Values.ClusterName}}-tidb-initializer
  labels:
    app.kubernetes.io/instance: demo 
    app.kubernetes.io/component: tidb-initializer
spec:
  backoffLimit: 1000
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: demo 
        app.kubernetes.io/component: tidb-initializer
    spec:
      restartPolicy: OnFailure
      containers:
      - name: mysql-client
        image: tnir/mysqlclient
        imagePullPolicy: "IfNotPresent"
        command:
        - python
        - -c
        - |
          import os, MySQLdb
          sql="""
          SET PASSWORD FOR 'root'@'%%' = '%s';
          FLUSH PRIVILEGES;
          """ % (os.environ.get('ROOT_PASSWORD'))
          # import sys; sys.stdout.flush(); print(sql)                                                                                                                                         
          conn = MySQLdb.connect(host='127.0.0.1', port=4000, user='root')
          for line in sql.splitlines():
              if line == '': 
                  continue
              conn.cursor().execute(line)
              conn.commit()
        env:
        - name: ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: "{{ .Values.tidb.passwordSecretName }}"
              key: root 
tennix commented 5 years ago

Related to refactoring secrets, we currently generate a random password if none is given. However, that workflow may be broken: helm/helm#3053

Yes, this is the biggest drawback of current approach. I agree with you to let user create secret outside of helm and reference the secret in values.yaml.