tikv / pd

Placement driver for TiKV

dashboard fails to log in to TiDB instance behind a load balancer in tidb-operator #2375

Closed kolbe closed 4 years ago

kolbe commented 4 years ago

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.

I have a TiDB cluster running in tidb-operator on AWS/EKS. I used kubectl to forward port 2379 from my local host to the PD pod that hosts the dashboard. TLS is enabled both between TiDB Server and its MySQL-compatible clients and for internal cluster communications. I added the correct TLS CA and client certificate to my web browser so that I could load the dashboard login page.

  2. What did you expect to see?

I should be able to authenticate.

  3. What did you see instead?

It seems that the dashboard connects to the hostname of a specific TiDB server pod, but that hostname is not among the SANs of the TLS certificate presented by the TiDB server (because not all pod names can be known at the time the certificate is generated). Instead, the certificate includes DNS names for the k8s service. As a result, it seems that PD cannot verify the identity of the TiDB server it is connecting to, so it refuses to continue the connection.

The error in the web browser is Sign in failed: error.api.user.signin.other, and further investigation in the developer console shows this response:

"authenticate failed, cause: error.api.user.signin.other: x509: certificate is valid for kolbe-test-cluster-tidb, kolbe-test-cluster-tidb.tidb-cluster, kolbe-test-cluster-tidb.tidb-cluster.svc, *.kolbe-test-cluster-tidb, *.kolbe-test-cluster-tidb.tidb-cluster, *.kolbe-test-cluster-tidb.tidb-cluster.svc, not kolbe-test-cluster-tidb-0.kolbe-test-cluster-tidb-peer.tidb-cluster.svc"
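The mismatch can be reproduced outside the cluster. The following is a minimal sketch, not PD's actual code path: it assumes only that the client side ends up applying Go's standard `x509` hostname check, and the helper name `checkHost` is hypothetical. It feeds the SANs from the error message above to `VerifyHostname` and shows that none of them covers the pod's peer hostname, since a wildcard matches exactly one leading label and the remaining labels must match literally.

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// serviceSANs are the DNS names listed in the error message above.
var serviceSANs = []string{
	"kolbe-test-cluster-tidb",
	"kolbe-test-cluster-tidb.tidb-cluster",
	"kolbe-test-cluster-tidb.tidb-cluster.svc",
	"*.kolbe-test-cluster-tidb",
	"*.kolbe-test-cluster-tidb.tidb-cluster",
	"*.kolbe-test-cluster-tidb.tidb-cluster.svc",
}

// podHost is the per-pod hostname the dashboard tried to reach.
const podHost = "kolbe-test-cluster-tidb-0.kolbe-test-cluster-tidb-peer.tidb-cluster.svc"

// checkHost (hypothetical helper) reports whether a certificate carrying
// the given DNS SANs would be accepted for host. Only the SAN list is
// consulted here; no chain or signature validation takes place.
func checkHost(sans []string, host string) error {
	cert := &x509.Certificate{DNSNames: sans}
	return cert.VerifyHostname(host)
}

func main() {
	// The pod hostname is rejected: "*.kolbe-test-cluster-tidb.tidb-cluster.svc"
	// has the right number of labels, but its second label is not
	// "kolbe-test-cluster-tidb-peer".
	fmt.Println("pod host:", checkHost(serviceSANs, podHost))

	// The service hostname is accepted, which is why connections through
	// the service name verify without trouble.
	fmt.Println("service host:", checkHost(serviceSANs, "kolbe-test-cluster-tidb.tidb-cluster.svc"))
}
```
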

  4. What version of PD are you using (pd-server -V)?

Release Version: v4.0.0-rc-29-geb9e209b
Git Commit Hash: eb9e209bf970a75987cc1a79bcdfec939b33be7f
Git Branch: release-4.0
UTC Build Time:  2020-04-22 04:24:41

kolbe commented 4 years ago

OK, this is addressed by documentation that describes adding the additional DNS suffixes to the certificate SAN.

Fixed by https://github.com/pingcap/docs-tidb-operator/pull/215
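
As a rough illustration of why extra SAN entries resolve this, here is a sketch that issues a throwaway self-signed certificate and checks it against the pod hostname from the report. The wildcard `*.kolbe-test-cluster-tidb-peer.tidb-cluster.svc` is an assumption made for this example, not a quote from the linked documentation, and `issueCert` is a hypothetical helper; a real deployment would add the names to whatever tool generates the cluster certificates.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issueCert (hypothetical helper) creates a short-lived self-signed
// certificate that carries the given DNS names in its SAN extension.
func issueCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "TiDB Server"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		DNSNames:     dnsNames,
	}
	// Self-signed: the template acts as its own parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const podHost = "kolbe-test-cluster-tidb-0.kolbe-test-cluster-tidb-peer.tidb-cluster.svc"

	cert, err := issueCert([]string{
		"kolbe-test-cluster-tidb.tidb-cluster.svc",
		// Assumed extra entry covering the per-pod peer hostnames.
		"*.kolbe-test-cluster-tidb-peer.tidb-cluster.svc",
	})
	if err != nil {
		panic(err)
	}

	// With the peer wildcard present, the hostname check returns nil.
	fmt.Println("verify error:", cert.VerifyHostname(podHost))
}
```

A wildcard entry is used rather than listing each pod name because, as noted in the report, the pod names are not all known when the certificate is generated.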