timescale / helm-charts

Configuration and Documentation to run TimescaleDB in your Kubernetes cluster
Apache License 2.0

[ISSUE] Primary service does not have a pod selector #394

Open · fungiboletus opened this issue 2 years ago

fungiboletus commented 2 years ago

Describe the bug

Primary service does not have a pod selector when the loadBalancer is disabled.

To Reproduce

Deploy timescaledb-single with loadBalancer.enabled set to false. Try to use the service, for example by doing a port forward.
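For example, something along these lines (release name and port-forward target are illustrative; the primary Service typically shares the release name):

    # install with the load balancer disabled (hypothetical release name "mycluster")
    helm install mycluster timescale/timescaledb-single --set loadBalancer.enabled=false
    # this port-forward then fails, because the primary Service has no pod selector
    kubectl port-forward svc/mycluster 5432:5432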

Expected behavior

The service should have a selector.

Deployment

N/A

Logs

N/A

Additional context

Discussed here: https://github.com/timescale/timescaledb-kubernetes/issues/278

dubiousdavid commented 2 years ago

All that is needed is to add the following under service.primary.spec in this file https://github.com/timescale/timescaledb-kubernetes/blob/master/charts/timescaledb-single/templates/svc-timescaledb.yaml:

selector:
    app: {{ template "timescaledb.fullname" . }}
    cluster-name: {{ template "clusterName" . }}
    role: master
feikesteenbergen commented 2 years ago

The primary service is designed not to have a selector, as Patroni fully manages the underlying Endpoints.

See also:

https://github.com/timescale/timescaledb-kubernetes/issues/384

fungiboletus commented 2 years ago

Why is a service created then?

dubiousdavid commented 2 years ago

Agreed with the above. A service with no pods doesn't make sense to me.

graveland commented 2 years ago

https://kubernetes.io/docs/concepts/services-networking/service/#services-without-selectors

Patroni is managing the endpoints, so the service is still usable, and Patroni can use its better understanding of the PostgreSQL cluster to make sure the service always points to the current primary database.
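For illustration, the pattern looks roughly like this (a minimal sketch with made-up names and IPs, not the chart's actual manifests):

    # A Service without a selector: Kubernetes will not manage its endpoints.
    apiVersion: v1
    kind: Service
    metadata:
      name: mycluster            # hypothetical name
    spec:
      ports:
        - port: 5432
          targetPort: 5432
    ---
    # Because the Service has no selector, no Endpoints are created automatically;
    # an external process (here: Patroni) creates and rewrites this object so it
    # always points at the current primary.
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: mycluster            # must match the Service name
    subsets:
      - addresses:
          - ip: 10.0.0.12        # pod IP of the current primary (example)
        ports:
          - port: 5432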

agronholm commented 2 years ago

I have forked the official timescaledb chart here: https://github.com/agronholm/timescaledb-kubernetes (see the installation instructions there). This is one of the issues the fork seeks to resolve. Let me know if this resolves the problem for you.

feikesteenbergen commented 2 years ago

Agreed with the above. A service with no pods doesn't make sense to me.

The Service has pods; they just happen to be configured by Patroni instead of a label selector.

agronholm commented 2 years ago

A service like that cannot be used with kubectl port-forward, so connecting to the database with external tools is only possible if the service is exposed outside of the cluster via a load balancer or node port. Fixing this issue is trivial, and I've done that in my fork of this chart.

feikesteenbergen commented 2 years ago

@agronholm thank you for your input; however, the labelSelector approach is slower than the Patroni-managed endpoints approach. The fix you implemented therefore works for your use case, but has the downside of slowing down reconnects after failover.

This is of course a tradeoff; this Helm Chart prefers fast failovers.

agronholm commented 2 years ago

I can only speak for myself obviously, but being told that I'm not allowed to connect to my database from outside the cluster is a showstopper issue. So I think this should at the very least be an option?

fungiboletus commented 2 years ago

I now understand the lack of selectors, and I think it makes sense. Perhaps the README.md could explain that you cannot port-forward to the service.

Also, in my case I use the single deployment, so fast failovers are not really needed. Is Patroni a must for single deployments?

feikesteenbergen commented 2 years ago

but being told that I'm not allowed to connect to my database

I'm not telling you that you're not allowed to connect; in fact, we built this Helm Chart to let you work with your database in Kubernetes.

@agronholm the service works fine through a LoadBalancer. What doesn't work is port-forwarding to the service, as it lacks a label selector.

Three possible approaches to solve this problem (there may be more):

  1. Change the implementation of the service to use a selector. Downside: Slower failovers
  2. Create an additional service for port-forwarding purposes (a sketch of this follows below).
  3. Use kubectl port-forward to the primary pod instead of the service, for example:
    kubectl port-forward $(kubectl get pod -l cluster-name=mycluster,role=master -o name) 6432:5432

This Helm Chart would like to keep fast failovers, which means option 1 is not considered. Option 2 could be added to the Chart, and option 3 can already be used today.
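For what it's worth, a minimal sketch of what option 2 could look like (the service name is made up; the labels follow the selector suggested earlier in this thread, so adjust them to your release):

    apiVersion: v1
    kind: Service
    metadata:
      name: mycluster-primary-direct   # hypothetical debug-only service
    spec:
      type: ClusterIP
      selector:
        app: mycluster                 # i.e. {{ template "timescaledb.fullname" . }}
        cluster-name: mycluster        # i.e. {{ template "clusterName" . }}
        role: master
      ports:
        - port: 5432
          targetPort: 5432

Because this extra service has a selector, kubectl port-forward works against it, while the selectorless primary service keeps the fast Patroni-driven failover.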

feikesteenbergen commented 2 years ago

@fungiboletus

Is patroni a must for single deployments

Technically, no; however, this Helm Chart works towards allowing you to scale up seamlessly. If we changed the implementation when replicas=1, we'd have to rewrite many parts of the Chart. Second, if you then moved from replicas=1 to replicas=2, some objects would need to be redefined, and I'm not sure what impact that would have on all the moving parts.

Work is underway in the Patroni project to support statically configured topologies (including the single-pod deployment), which may be able to address this.

agronholm commented 2 years ago

There must be something I don't understand. Why does the mere presence of a labelSelector cause slowness in failovers? Does this not affect the replica service (which does have a labelSelector)? At any rate, a label selector should be easy to add as an option. I could make a PR to that end.

feikesteenbergen commented 2 years ago

https://v1-22.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#servicespec-v1-core:

selector: Route service traffic to pods with label keys and values matching this selector. If empty or not present, the service is assumed to have an external process managing its endpoints, which Kubernetes will not modify. Only applies to types ClusterIP, NodePort, and LoadBalancer. Ignored if type is ExternalName. More info: https://kubernetes.io/docs/concepts/services-networking/service/

The key part being:

If empty or not present, the service is assumed to have an external process managing its endpoints

Where the external process == Patroni

Patroni immediately updates the endpoints used by the service after a failover. The selector only updates after a while; it is not triggered by a failover.
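You can watch this happen yourself during a switchover (hypothetical cluster name):

    # Patroni rewrites these endpoints the moment a failover completes
    kubectl get endpoints mycluster --watch -o wide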

feikesteenbergen commented 2 years ago

Does this not affect the replica service

The replica service's endpoints are not managed by Patroni, and it therefore uses a selector.

agronholm commented 2 years ago

Ok, thanks for taking the time to explain. I think I understand the issue a lot better now.

fungiboletus commented 2 years ago

3. Use kubectl port-forward to the primary pod instead of the service, for example:

    kubectl port-forward $(kubectl get pod -l cluster-name=mycluster,role=master -o name) 6432:5432

Is there an easy way to know the credentials for this? I only find Patroni-related passwords in the secrets. I can override the password like this (the \password command is typed at the psql prompt):

kubectl exec -it $(kubectl get pod -l cluster-name=mycluster,role=master -o name) -- psql 
\password

But it's not a very Kubernetes way of doing things.

agronholm commented 2 years ago

What credentials do you mean?

fungiboletus commented 2 years ago

What credentials do you mean?

The credentials to connect to the database.

agronholm commented 2 years ago

You can use the postgres username and the password from PATRONI_SUPERUSER_PASSWORD.
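For example, something like this (the Secret name is an assumption; check kubectl get secrets for the name your release actually created):

    # read the superuser password from the credentials Secret (name assumed)
    # and connect through the port-forward from earlier in the thread
    PGPASSWORD="$(kubectl get secret mycluster-credentials \
      -o jsonpath='{.data.PATRONI_SUPERUSER_PASSWORD}' | base64 -d)" \
      psql -h localhost -p 6432 -U postgres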