canonical / postgresql-k8s-operator

A Charmed Operator for running PostgreSQL on Kubernetes
https://charmhub.io/postgresql-k8s
Apache License 2.0

No way to access PostgreSQL from outside of the k8s environment #657

Open nobuto-m opened 2 months ago

nobuto-m commented 2 months ago

Steps to reproduce

  1. juju deploy postgresql-k8s -n3 on top of a Kubernetes cluster (for example, MicroK8s); a fuller command sketch follows below.
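A minimal, hedged sketch of the deploy step (the 14/stable channel and the --trust flag are assumptions inferred from the versions reported below; adjust to your environment):

```shell
# Deploy a 3-unit PostgreSQL K8s cluster on the current Juju K8s model
juju deploy postgresql-k8s -n 3 --channel 14/stable --trust

# Watch until all units settle into active/idle
juju status --watch 5s
```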

Expected behavior

There is an option to expose the PostgreSQL endpoint to a client outside of the Kubernetes cluster.

Actual behavior

The PostgreSQL endpoint is only exposed on 10.1.0.0/16 (the default MicroK8s Calico range). There is no option to expose it outside of the Kubernetes cluster, nor any relation available to a reverse proxy such as Traefik.

Versions

Operating system: Ubuntu 22.04 LTS

Juju CLI: 3.5.3-genericlinux-amd64

Juju agent: 3.5.3

Charm revision: postgresql-k8s 14/stable 281

microk8s: MicroK8s v1.28.13 revision 7150

Log output

Juju debug log:

N/A

Additional context

syncronize-issues-to-jira[bot] commented 2 months ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5265.

This message was autogenerated

taurus-forever commented 2 months ago

Dear @nobuto-m, thank you for the bug report!

To connect to PostgreSQL K8s from outside of K8s/Juju, the data-integrator charm must be used in the following topology: data-integrator <> pgbouncer-k8s <> postgresql-k8s (this will also create a database, a user, and a password to access the DB).

To provide the network connectivity, a K8s NodePort will be opened by pgbouncer-k8s IF it is related to data-integrator.

You can find more technical details in this PR: https://github.com/canonical/pgbouncer-k8s-operator/pull/264; the official documentation is going to be published on Charmhub soon.
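A minimal command sketch of this topology (the channel names and the database-name value are assumptions for illustration; check Charmhub for the current ones):

```shell
# Deploy pgbouncer-k8s and data-integrator next to postgresql-k8s
juju deploy pgbouncer-k8s --channel 1/stable --trust
juju deploy data-integrator --config database-name=test-db

# Wire up: data-integrator <> pgbouncer-k8s <> postgresql-k8s
juju integrate pgbouncer-k8s postgresql-k8s
juju integrate data-integrator pgbouncer-k8s

# Retrieve the generated database, user and password
juju run data-integrator/leader get-credentials
```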

Can you please test it from your side? Thank you!

nobuto-m commented 2 months ago

To provide the network connectivity, a K8s NodePort will be opened by pgbouncer-k8s IF it is related to data-integrator.

I'm raising a theoretical challenge here without having tested it yet. But NodePort sounds like it is missing a reliable endpoint to the database for HA, since the client cannot see a single endpoint that is always connectable. I.e., with NodePort you can always connect to the same port, but you cannot always connect to the same IP address or FQDN, since there is no load balancer to help.

What I'm comparing it with is the COS charms, for example, where each service is exposed using Traefik backed by MetalLB.

delgod commented 2 months ago

Our performance tests have shown clear performance degradation when any sort of proxy is used between the DB client and the database components. I think @phvalguima can share the performance testing results.

To avoid performance penalties, we recommend running a few PgBouncer units on a few concrete K8s nodes and using the IPs of these nodes in the connection string for the client application (e.g. postgresql://k8s-node1:NodePort,k8s-node2:NodePort,k8s-node3:NodePort/dbname). If the user follows this recommendation, traffic received on the NodePort will be redirected to the local PgBouncer and avoids any high-latency processing. Of course, PgBouncer should be strictly bound to the K8s nodes to make it work fast. Because it is a LIST of IPs, the connection string continues to work as long as PgBouncer is answering on at least one IP address.
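For illustration, a hedged client-side sketch using libpq's multi-host URI syntax (the host names, the 30000 NodePort, and the credentials are placeholders):

```shell
# libpq tries each host:port pair in order until one answers;
# note that the port must be repeated per host, since a port given
# only after the last host applies to that host alone
psql "postgresql://myuser:mypassword@k8s-node1:30000,k8s-node2:30000,k8s-node3:30000/dbname"
```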

I will be happy to have a call about this topic. Such a setup is easier and faster than any LoadBalancer or Ingress.

phvalguima commented 2 months ago

@nobuto-m as @taurus-forever and @delgod pointed out, there are a few things here:

  1. pgbouncer and mysql-router are already "balancers", as they interpret the SQL traffic and decide which backend is best.
  2. You can integrate with more than one pgbouncer at once (as @delgod described above).
  3. The Ingress API, in theory, only works with HTTP. In practice, ingress controllers extend the configuration via their own CRDs (e.g., that is what Traefik does with its EntryPoints, AFAIR).

The reasoning behind NodePorts is the following: reduce the number of hops as much as possible, since we want to reduce latency. So, we always want to hit the same nodes that are running pgbouncer.
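A hedged sketch for checking this locality from the Kubernetes side (the my-model namespace and the label selector are assumptions and may differ per charm version):

```shell
# See which K8s nodes actually run the pgbouncer pods
kubectl get pods -n my-model -l app.kubernetes.io/name=pgbouncer-k8s -o wide

# Inspect the NodePort service opened by pgbouncer-k8s
kubectl get svc -n my-model
```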