medic / cht-sync

Data synchronization between CouchDB and PostgreSQL for the purpose of analytics.
GNU General Public License v3.0
2 stars 2 forks source link

Create Postgres instance using EKS #87

Open njuguna-n opened 2 months ago

njuguna-n commented 2 months ago

Create a postgres instance for use with the BRAC work as suggested here. The current RDS instance is full and requires the support of the infra team to debug and get logs.

witash commented 2 months ago

@Hareet there's a draft helm chart for cht-sync here which includes a postgres instance. But how can we allow access to it? I was using port forwarding locally but do not have permissions to on EKS, and anyway will not work for dashboards externally

witash commented 2 months ago

@Hareet there's a draft helm chart for cht-sync here which includes a postgres instance. But how can we allow access to it? I was using port forwarding locally but do not have permissions to on EKS, and anyway will not work for dashboards externally

I see EKS will create an internet facing ELB if you request a LoadBalancer service, but its not accepting TCP connections

apiVersion: v1
kind: Service
metadata:
  name: postgres
  annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  selector: 
    inner.service: postgres
  ports:
    - protocol: TCP
      port: {{ .Values.postgres.port | default "5432" }}
      targetPort: 5432
witash commented 2 months ago

nvm, it is working

apiVersion: v1
kind: Service
metadata:
  name: postgres
  annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: TCP
spec:
  type: LoadBalancer
  selector:
    inner.service: postgres
  ports:
    - protocol: TCP
      port: {{ .Values.postgres.port | default "5432" }}
      targetPort: 5432
njuguna-n commented 2 months ago

I got postgres in EKS working thanks to @witash and the work he has done here. There are a few things to fix like creating the default schema and table automatically but that can be addressed in the PR. I will close this as done.

Hareet commented 2 months ago

@witash @njuguna-n

Sorry i didn't catch this last week.

A few notes and minor fix requests ->

It should be a quick fix, I'll let you all try it first and feel free to ping me on Slack if you get any errors or stuck.

witash commented 2 months ago

the load balancer from cht-core ingress is an application load balancer, which seems to be only for http traffic. what we are trying to do here is expose the postgres DB, so it needs to be plain tcp.

is there an exiting network load balancer that we could attach to instead?

maybe we could just use a node port, but are node IPs going to be static enough that will be reliable?

Hareet commented 1 month ago

@witash Yup, you are correct. Application Load Balancer won't allow TCP, but Network Load Balancer would have worked for both situations.

unrelated question, is this postgres meant to be accessible by public internet? Is that the vision for any project using cht-sync + postgres?

And I may have missed the discussion somewhere, but ssh-tunneling to an internal network that is running postgres won't fit this use-case?

witash commented 1 month ago

@Hareet for dev dbs they basically need to be accessible by public internet, for product and app developers to test and develop models. for production dbs, ideally yes there would be some network restrictions. practically it might be difficult to add restrictions while still allowing access from external dahsboard things, e.g. klipfolio

ssh-tunneling would be fine, but how would that work? where would ssh be running and how would that have access to a postgres pod running in kubernetes?

Hareet commented 1 month ago

ssh-tunneling would be fine, but how would that work? where would ssh be running and how would that have access to a postgres pod running in kubernetes?

We run a few auto-ssh-tunnels to the old RDBMS (huge postgres server) already. SSH exposed on the postgres server, but we could allow ssh to a pod that is can connect to postgres through internal kubernetse service resource names, or we could terminate tls at postgres. I skimmed the link to the helm-chart for cht-sync that was in a comment above and couldnt find any tls config. Is the latter completed or being considered? It would solve the production db's question.

And I didn't mean to keep this issue open, we can close. The load balancer pricing is also much lower than expected, likely due to metered usage.

njuguna-n commented 1 month ago

@Hareet @witash drawing attention to this conversation again. I have a ticket for adding TLS to the postgres pod but ran into a blocker. I currently need to connect Klipfolio to the Postgres running in my EKS namespace for a proof of concept dashboard. What is the best(fastest and most secure) way to do that?