pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics.
https://pingcap.com
Apache License 2.0

lightning/dumpling/dm/cdc/sync-diff-inspector: Support encrypting the MySQL password #30524

Open kennytm opened 3 years ago

kennytm commented 3 years ago

Feature Request

Is your feature request related to a problem? Please describe:

Currently the MySQL password is stored in plain text in config.toml, which some users are uncomfortable with.

Describe the feature you'd like:

Provide some way to hide the password. Example:

```toml
# the current situation.
[tidb]
password = "Passw0rd!!"

# same as above, only base-64 encoded
[tidb]
password = { base64 = "UGFzc3cwcmQhIQ==" }

# same as above (using TOML's dotted-key feature)
[tidb]
password.base64 = "UGFzc3cwcmQhIQ=="

# read from a file
[tidb]
password.file = "/data/secret/lightning.txt"

# read from environment
[tidb]
password.env = "TIDB_PASSWORD"

# read from file, cached in memory, so the file can be deleted after Lightning starts
# (without caching Lightning will fetch the password from the file in case of reconnect)
[tidb]
password.cached.file = "/data/secret/lightning.txt"

# encrypted using AES-256-CTR, key read from a local file in raw binary.
[tidb]
password.aes-256-ctr = { data.base64 = "zdpUpVhoJoggKw==", key.file = "/data/secret/lightning-key.bin", nonce.base64 = "XN74TC92g2MBNbzEPpxZUA==" }

# (same as above)
[tidb.password.aes-256-ctr]
data.base64 = "zdpUpVhoJoggKw=="
key.file = "/data/secret/lightning-key.bin"
nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="

# (also same as above)
[tidb]
password.aes-256-ctr.data.base64 = "zdpUpVhoJoggKw=="
password.aes-256-ctr.key.file = "/data/secret/lightning-key.bin"
password.aes-256-ctr.nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="

# can also read the encrypted password from a binary file.
[tidb]
password.aes-256-ctr = { data.file = "/data/secret/lightning.enc", key.file = "/data/secret/lightning-key.bin", nonce.base64 = "XN74TC92g2MBNbzEPpxZUA==" }
```

The password will always be decrypted in the Lightning process no matter which algorithm is chosen, since the MySQL protocol demands the original password for authentication.

Describe alternatives you've considered:

Don't do it. Rely on Lightning-in-SQL.

Teachability, Documentation, Adoption, Optimization:

King-Dylan commented 3 years ago

I need this feature (only base-64 encoded)

pepezzzz commented 3 years ago

Financial enterprises run Lightning from shell scripts to import data routinely without any interaction, so the user password stored in the configuration file should be encoded with base64.

kennytm commented 3 years ago

Specification

Configuration syntax (Lightning, DM, CDC, sync-diff-inspector)

In existing structured configuration formats (JSON, TOML, YAML), a password field of type "string" can be naturally extended to a dynamic variable, which keeps the secret itself out of the configuration.

A variable is a datatype satisfying the following JSON schema:

```json
{
  "$schema": "https://json-schema.org/draft/draft-07/schema#",
  "$id": "#variable",
  "oneOf": [
    {
      "comment": "plain text",
      "type": "string"
    },
    {
      "comment": "decode variable using base64 on decrypt",
      "type": "object",
      "properties": { "base64": {"$ref": "#variable"} },
      "required": ["base64"]
    },
    {
      "comment": "read from file using path from variable on decrypt",
      "type": "object",
      "properties": { "file": {"$ref": "#variable"} },
      "required": ["file"]
    },
    {
      "comment": "read from environment using name from variable on decrypt",
      "type": "object",
      "properties": { "env": {"$ref": "#variable"} },
      "required": ["env"]
    },
    {
      "comment": "immediate decrypt on load and cache decrypted result in memory",
      "type": "object",
      "properties": { "cached": {"$ref": "#variable"} },
      "required": ["cached"]
    },
    {
      "comment": "decrypt variable by AES-256-CTR",
      "type": "object",
      "properties": {
        "aes-256-ctr": {
          "type": "object",
          "properties": {
            "data": {"$ref": "#variable"},
            "key": {"$ref": "#variable"},
            "nonce": {"$ref": "#variable"}
          },
          "required": ["data", "key", "nonce"]
        }
      },
      "required": ["aes-256-ctr"]
    }
  ]
}
```
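To make the recursive shape of the schema concrete, here is a minimal Python sketch of a structural check that mirrors its branches by hand. The function name `is_variable` is hypothetical and not part of any existing tool; the sketch only checks shape and does not read files, environment variables, or keys.

```python
def is_variable(value):
    """Return True if `value` has the shape described by the variable schema."""
    if isinstance(value, str):
        return True  # plain text
    if not isinstance(value, dict):
        return False

    def wrapper_ok(key):
        # {"base64": <variable>}, {"file": ...}, {"env": ...}, {"cached": ...}
        return key in value and is_variable(value[key])

    def aes_ok():
        inner = value.get("aes-256-ctr")
        return isinstance(inner, dict) and all(
            part in inner and is_variable(inner[part])
            for part in ("data", "key", "nonce")
        )

    matches = sum(wrapper_ok(k) for k in ("base64", "file", "env", "cached"))
    matches += aes_ok()
    return matches == 1  # "oneOf": exactly one object branch may match


print(is_variable({"file": {"env": "TIDB_PASSWORD_FILE"}}))  # True
print(is_variable({"base64": "a", "file": "b"}))             # False
```

The second call is rejected because two branches match at once, which is what `oneOf` forbids.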

Variables can be cascaded, allowing us to use the same vocabulary for providing the password directly or the decryption key and nonce. This also allows us to provide some useful features like

```toml
# read $TIDB_PASSWORD_FILE, which contains a file path, which contains the password itself.
# (this is similar to how GCP's credential file operates)
[tidb]
password.file.env = "TIDB_PASSWORD_FILE"
```

or some useless security theater like 🙃

```toml
[tidb]
password.base64.base64.base64.base64.base64.base64 = "Vm0xNFUxSXhWWGhXYTJSWFlURktWRlpyVWtKUFVUMDk="
```
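Cascading falls out naturally if each level resolves its inner variable first. The following is a rough Python sketch, not the proposed library: `load` is a hypothetical name, it assumes exactly one key per level, it returns file contents as raw bytes without trimming, and it leaves out `aes-256-ctr` because the Python standard library has no AES implementation.

```python
import base64
import os


def load(variable):
    """Turn a variable into a zero-argument function that returns bytes.

    The returned function re-evaluates the variable on every call, except
    under "cached", which evaluates once at load time and keeps the result.
    """
    if isinstance(variable, str):
        return lambda: variable.encode("utf-8")
    (kind, inner), = variable.items()  # assumption: one key per level
    if kind == "cached":
        value = load(inner)()  # resolve immediately, hold the result in memory
        return lambda: value
    source = load(inner)
    if kind == "base64":
        return lambda: base64.b64decode(source())
    if kind == "env":
        return lambda: os.environ[source().decode("utf-8")].encode("utf-8")
    if kind == "file":
        def read_file():
            with open(source().decode("utf-8"), "rb") as f:
                return f.read()
        return read_file
    raise ValueError(f"unsupported variable kind: {kind}")


password = load({"base64": "UGFzc3cwcmQhIQ=="})
print(password())  # b'Passw0rd!!'
```

With this shape, `load({"file": {"env": "TIDB_PASSWORD_FILE"}})` reads the environment variable and then the file on every call, while wrapping the same variable in `cached` reads both once, so the file can be deleted afterwards.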

TOML's dotted-key feature makes it particularly easy to spell out these nested structures. Unfortunately YAML has no equivalent, so the structure has to be written out in full:

```yaml
target-database:
  host: 127.0.0.1
  port: 3306
  user: root
  password:
    aes-256-ctr:
      data: {base64: zdpUpVhoJoggKw==}
      key: {file: /data/secret/lightning-key.bin}
      nonce: {base64: XN74TC92g2MBNbzEPpxZUA==}
```

Command line syntax (Lightning, Dumpling)

We prefer providing dotted command line flags like

```shell
./dumpling -h 127.0.0.1 -P 3306 \
    -u root \
    --password.aes-256-ctr.data.base64 'zdpUpVhoJoggKw==' \
    --password.aes-256-ctr.key.file '/data/secret/lightning-key.bin' \
    --password.aes-256-ctr.nonce.base64 'XN74TC92g2MBNbzEPpxZUA=='
```

This, however, requires spf13/pflag#187 or spf13/pflag#199 or spf13/pflag#285 (the number of duplicate PRs shows how well maintained the pflag library is). If these PRs aren't merged and we can't switch to one of the forks, we may need to fall back to a more conventional but uglier API like:

```shell
./dumpling -h 127.0.0.1 -P 3306 \
    -u root \
    --encrypted-password '{"aes-256-ctr":{
        "data":{"base64":"zdpUpVhoJoggKw=="},
        "key":{"file":"/data/secret/lightning-key.bin"},
        "nonce":{"base64":"XN74TC92g2MBNbzEPpxZUA=="}
    }}'
```
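The two command-line spellings carry the same information: each dotted flag name is a path into the nested variable. As an illustration only, the Python sketch below folds already-parsed dotted flags into the JSON form; `flags_to_variable` is a hypothetical helper, and the flag parsing itself is assumed to have happened elsewhere.

```python
import json


def flags_to_variable(flags, prefix="password"):
    """Fold {"password.a.b": "v"} into the nested variable {"a": {"b": "v"}}."""
    if prefix in flags:
        return flags[prefix]  # plain --password 'secret'
    root = {}
    for name, value in flags.items():
        if not name.startswith(prefix + "."):
            continue  # unrelated flag
        *parents, leaf = name[len(prefix) + 1:].split(".")
        node = root
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


flags = {
    "password.aes-256-ctr.data.base64": "zdpUpVhoJoggKw==",
    "password.aes-256-ctr.key.file": "/data/secret/lightning-key.bin",
    "password.aes-256-ctr.nonce.base64": "XN74TC92g2MBNbzEPpxZUA==",
}
print(json.dumps(flags_to_variable(flags), separators=(",", ":")))
```

The printed JSON is the compact form of the value passed to `--encrypted-password` in the fallback example.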

Encryption tool

There should be a tool to generate the encrypted password (maybe through tidb-lightning-ctl / br debug / dmctl / tidb-ctl):

```shell
$ ./some-password-tool encrypt base64 -f toml -r 'tidb.password'
Enter password: ••••••••••
## Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
[tidb]
password.base64 = "UGFzc3cwcmQhIQ=="

$ ./some-password-tool encrypt base64 --password 'Passw0rd!!' -f yaml -r 'target-database.password'
## Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
target-database:
  password:
    base64: "UGFzc3cwcmQhIQ=="

$ ./some-password-tool encrypt base64 --password 'Passw0rd!!' -f json -r ''
// Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
{"base64":"UGFzc3cwcmQhIQ=="}

$ ./some-password-tool encrypt aes-256-ctr --password 'Passw0rd!!' --key-file ./lightning-key.bin | tee encrypted.toml
password.aes-256-ctr.data.base64 = "zdpUpVhoJoggKw=="
password.aes-256-ctr.key.file = "/data/secret/lightning-key.bin"
password.aes-256-ctr.nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="
```

The possible subcommands are "base64" and "aes-256-ctr".

| Argument | Meaning |
|---|---|
| `-f`, `--format` | Output format: TOML (default), YAML, JSON, CLI |
| `-r`, `--root` | A dot-separated key path of the root. Defaults to 'password'. |
| `-p`, `--password` | The password to encrypt. Reads from stdin if empty. |
| `-k`, `--key-file` | For aes-256-ctr only. A 32-byte file containing the encryption key. |
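How `--format` and `--root` interact in the `base64` examples can be sketched in Python as follows. This is an illustration under assumptions, not the tool's implementation: `render_base64` is a hypothetical name, the warning line and the CLI format are left out, and the TOML rule (every root segment but the last becomes the table header) is inferred from the sample outputs.

```python
import base64
import json


def render_base64(password, root="password", fmt="toml"):
    """Render a base64-encoded password under a dot-separated root path."""
    encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")
    path = root.split(".") if root else []
    if fmt == "json":
        doc = {"base64": encoded}
        for key in reversed(path):
            doc = {key: doc}  # wrap from the innermost key outwards
        return json.dumps(doc, separators=(",", ":"))
    if fmt == "yaml":
        lines = ["  " * depth + key + ":" for depth, key in enumerate(path)]
        lines.append("  " * len(path) + 'base64: "' + encoded + '"')
        return "\n".join(lines)
    if fmt == "toml":
        tables, leaf = path[:-1], path[-1:]
        lines = ["[" + ".".join(tables) + "]"] if tables else []
        lines.append(".".join(leaf + ["base64"]) + ' = "' + encoded + '"')
        return "\n".join(lines)
    raise ValueError("unsupported format: " + fmt)


print(render_base64("Passw0rd!!", "tidb.password", "toml"))
# [tidb]
# password.base64 = "UGFzc3cwcmQhIQ=="
```

Keys containing characters outside TOML's bare-key set or YAML's plain scalars would need quoting, which this sketch does not attempt.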

The same tool should also be able to decrypt the password.

```shell
$ ./some-password-tool decrypt -f toml < encrypted.toml
Passw0rd!!

$ echo '{"x":{"base64":"UGFzc3cwcmQhIQ=="}}' | ./some-password-tool decrypt -f json -r x
Passw0rd!!

$ # with aes-256-ctr, losing the key file should make the decryption output garbage
$ echo -n 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' > /data/secret/lightning-key.bin
$ ./some-password-tool decrypt < encrypted.toml
u÷  dµ�+�Z"
```

This tool can also be built as a static webpage, which would be the most user-friendly option. However, there may be a perception issue: users may worry that a password entered on a web interface is secretly sent to PingCAP (it won't be).

Module path

The library will be placed in pingcap/tidb-tools for now. We may later decide to move it into pingcap/tidb as a separate Go sub-module.

kennytm commented 2 years ago

Design notes