tikv / pd

Placement driver for TiKV
Apache License 2.0
1.05k stars 719 forks source link

log: requesting log desensitization #2852

Closed Yisaer closed 4 years ago

Yisaer commented 4 years ago

Development Task

To reinforce the security in the PD, one thing we need to do is to do the log desensitization.
Here are some examples we need to hide from the logs in my view:

  1. region key in logs like following:

    log.Error("wrong range keys",zap.String("start-key", string(HexRegionKey(startKey))),zap.String("end-key", string(HexRegionKey(endKey))))
  2. etcd key and value in logs like following:

    log.Warn("kv gets too slow", zap.String("request-key", key), zap.Duration("cost", cost), zap.Error(err))
  3. store label key and value in logs like following:

    log.Warn("not found the key match with the store label",zap.Stringer("store", s.GetMeta()),zap.String("label-key", key))
  4. placement rule key and value in logs like following:

    log.Error("rule is in bad format", zap.String("rule-key", k), zap.String("rule-value", v), zap.Error(errs.ErrLoadRule.FastGenByArgs()), zap.NamedError("cause", err))

We will add a new configuration like "enable-log-desensitization"(default false). If this configuration is enabled, the sensitive information won't appear on the previous log.

Yisaer commented 4 years ago

@nolouch @disksing @rleungx @HunDunDM @lhy1024 WDYT? or any other advices?

lhy1024 commented 4 years ago

LGTM. In addition to the key, what else do we need to hide, such as topology?

disksing commented 4 years ago

Can we log sha1 instead? At least we can know if the key is changed.

Yisaer commented 4 years ago

@disksing It's ok to me. And from https://github.com/pingcap/tidb/pull/19409, I found that tidb directly replace the key by '?'. Maybe we should unify the action.

rleungx commented 4 years ago

IMO, ? is no meaning for PD.

disksing commented 4 years ago

I think use hash instead of ? can provide more info without leaking user information. But we need to inquire about the compliance.

Yisaer commented 4 years ago

After discussion with tikv/tidb group, currently we will omit the region key information if log-redact is enabled.