cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

cli: add support for using SRV DNS records for node ip:port resolution #64439

Closed sheaffej closed 1 year ago

sheaffej commented 3 years ago

A CockroachDB enterprise customer requests consideration of adding support for finding node ip:port using DNS SRV record answers when using the cockroach sql and similar CLI commands.

This would be somewhat similar to how cockroach start would use DNS SRV answers to find node ip:port for joining. https://github.com/cockroachdb/cockroach/issues/45789

Why is this important?

Customers who deploy CockroachDB on-prem using cloud abstraction/orchestration platforms like Kubernetes (k8s) or Nomad may not have a k8s- or Nomad-compatible on-prem load balancer available, because many k8s distros and Nomad defer the creation and operation of LoadBalancer services to the cloud provider. Several customers are building this type of on-prem load balancing as a service (LBaaS), but the implementation is not yet straightforward.

With platforms like k8s and Nomad, CockroachDB ports are often dynamically allocated and mapped, so the port is not known from the client CLI's perspective (e.g. the CockroachDB pod is not reachable at 26257 from the client, and may instead be at a random port like 34523). A load balancing service resolves this issue because it routes from a known hostname and port to the dynamically allocated host and port of a CockroachDB node. However, many k8s distros and Nomad rely on cloud-vendor-provided load balancing services, which are often not present in on-prem deployments.

In the meantime, while customers work to build LBaaS platforms, they are handling load balancing at the application level using various techniques. For example, applications can query a service registry and then set up their connection pools on initialization. But when using the cockroach sql CLI, the user must supply the dynamically allocated hostname and port of a CockroachDB node. A human user has to look these up, which is an undesirable burden. In addition, the k8s or Nomad platform can change the hostname and port at any time, so this information can't reliably be scripted or pulled from environment variables.

Additional thoughts regarding the feature

Preferably the CLI could support both normal (_port-name._proto-name.service-name.domain) and headless (service-name.domain) SRV services to look up the hostnames and ports of CockroachDB nodes. If the SRV DNS response contains multiple answers, the CLI could select one of them and attempt to connect to that host and port. If that connection fails, the CLI could try the next answer in the response. Depending on the situation, it may not be desirable for the CLI to keep trying answers until it connects, so this could be made optional via another flag.

It may also be desirable to control whether the CLI tries the SRV answers sequentially, in the order they were received, or picks randomly from the set of answers. The latter could help when multiple invocations of the CLI are connecting, so that they don't all connect to the same CockroachDB node and cause a connection imbalance.

Jira issue: CRDB-7058

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!