Open wxiaomou opened 1 year ago
https://github.com/pingcap/tidb-operator/blob/master/pkg/discovery/discovery.go#L271 need to be fixed and include not only
PeerMembers
butMembers
as well.
Members
should already be included in returnPDMembers := []string{pdURL}
where pdURL
is the URL to the local Service.
And for https://github.com/pingcap/tidb-operator/blob/master/pkg/discovery/discovery.go#L260, it needs to actually connect to the pd and fetch the member instead of just construct a url from spec this gave us wrong pd url in our use case
Can you show the case with the wrong PD URL? Yeah, connecting and fetching members may be better.
Members should already be included in returnPDMembers := []string{pdURL} where pdURL is the URL to the local Service.
For example, we are running tidb in 3 k8s cluster 1a, 1b and 1e. This is from 1e. When render the TiKV start script it uses https://mussel-prod-pd:2379
when it call verifyPdendPoint which is the pdUrl in returnPDMembers := []string{pdURL}
/tikv-server --pd=https://mussel-prod-pd:2379,https://mussel-prod-pd-0.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1a.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-0.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1b.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-1.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1a.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-1.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1b.mussel-prod.tidb.musta.ch:2379 --advertise-addr=mussel-prod-tikv-0.mussel-prod-tikv-peer.tidb-mussel-prod.svc.us-east-1e.mussel-prod.tidb.musta.ch:20160 --addr=0.0.0.0:20160 --status-addr=0.0.0.0:20180 --advertise-status-addr=mussel-prod-tikv-0.mussel-prod-tikv-peer.tidb-mussel-prod.svc.us-east-1e.mussel-prod.tidb.musta.ch:20180 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml
@csuzhangxc
Can you show the case with the wrong PD URL? Yeah, connecting and fetching members may be better. For example we have a heterogeneous tidb cluster which just contains tidb instance. So we have two tc in this case
- mussel-prod
- mussel-prod-heterogeneous
addr := controller.PDPeerFullyDomain(tc.Spec.Cluster.Name, tc.Spec.Cluster.Namespace, tc.Spec.Cluster.ClusterDomain)
so the tc.Spec.Cluster.Name
is mussel-prod-heterogeneous
which is wrong since PD is in tc mussel-prod
. Does this make sense to you @csuzhangxc ?
Members should already be included in returnPDMembers := []string{pdURL} where pdURL is the URL to the local Service.
For example, we are running tidb in 3 k8s cluster 1a, 1b and 1e. This is from 1e. When render the TiKV start script it uses
https://mussel-prod-pd:2379
when it call verifyPdendPoint which is the pdUrl inreturnPDMembers := []string{pdURL}
/tikv-server --pd=https://mussel-prod-pd:2379,https://mussel-prod-pd-0.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1a.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-0.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1b.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-1.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1a.mussel-prod.tidb.musta.ch:2379,https://mussel-prod-pd-1.mussel-prod-pd-peer.tidb-mussel-prod.svc.us-east-1b.mussel-prod.tidb.musta.ch:2379 --advertise-addr=mussel-prod-tikv-0.mussel-prod-tikv-peer.tidb-mussel-prod.svc.us-east-1e.mussel-prod.tidb.musta.ch:20160 --addr=0.0.0.0:20160 --status-addr=0.0.0.0:20180 --advertise-status-addr=mussel-prod-tikv-0.mussel-prod-tikv-peer.tidb-mussel-prod.svc.us-east-1e.mussel-prod.tidb.musta.ch:20180 --data-dir=/var/lib/tikv --capacity=0 --config=/etc/tikv/tikv.toml
@csuzhangxc
is there any problem when using https://mussel-prod-pd:2379
in 1e (for its local member)?
Can you show the case with the wrong PD URL? Yeah, connecting and fetching members may be better. For example we have a heterogeneous tidb cluster which just contains tidb instance. So we have two tc in this case
- mussel-prod
- mussel-prod-heterogeneous
addr := controller.PDPeerFullyDomain(tc.Spec.Cluster.Name, tc.Spec.Cluster.Namespace, tc.Spec.Cluster.ClusterDomain)
so the
tc.Spec.Cluster.Name
ismussel-prod-heterogeneous
which is wrong since PD is in tcmussel-prod
. Does this make sense to you @csuzhangxc ?
you mean, tc.Spec.Cluster.Name
in mussel-prod
is mussel-prod-heterogeneous
, but not tc.Spec.Cluster.Name
in mussel-prod-heterogeneous
is mussel-prod
?
if PD is in mussel-prod
:
mussel-prod
, I think tc.Heterogeneous() && tc.WithoutLocalPD()
will be for false
, and it will use pdURL
mussel-prod-heterogeneous
, it should set tc.Spec.Cluster.Name
as mussel-prod
, and then use addr := controller.PDPeerFullyDomain(tc.Spec.Cluster.Name, tc.Spec.Cluster.Namespace, tc.Spec.Cluster.ClusterDomain)
to mussel-prod
Bug Report
What version of Kubernetes are you using?
What version of TiDB Operator are you using?
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?
N/A What's the status of the TiDB cluster pods?
N/A What did you do?
https://github.com/pingcap/tidb-operator/blob/master/pkg/discovery/discovery.go#L271 need to be fixed and include not only
PeerMembers
butMembers
as well.And for https://github.com/pingcap/tidb-operator/blob/master/pkg/discovery/discovery.go#L260, it needs to actually connect to the pd and fetch the member instead of just construct a url from spec
What did you expect to see?
What did you see instead? In the discovery.go when it's generating the list of PD urls it only includes the PeerMembers but not local member. We are deploying tidb across 3 k8s clusters. Does it make sense to include the local PD in the
VerifyPDEndpoint
response?Also we have Heterogeneous tidb and tikv cluster, currently discovery is generating to url from spec, this gave us wrong pd url in our use case. Does it make sense to actually connect to PD and fetch the PD member?