crash on postgres replica

kestra-io / helm-charts

Apache License 2.0


crash on postgres replica #11

Closed ghost closed 1 year ago

ghost commented 1 year ago

Expected Behavior

postgresql:
  enabled: true
  primary:
    extendedConfiguration: |
        max_connections = 2048
    persistence:
      size: 1024Gi
  readReplicas:
    replicaCount: 2
    extendedConfiguration: |
        max_connections = 2048
    persistence:
      size: 1024Gi

This configuration should work.

Actual Behaviour

The kestra-postgres DNS name cannot be resolved.

I looked at the Helm template at https://github.com/kestra-io/helm-charts/blob/8a01e1b6f2521a760d034428d9166b9d8c127065/charts/kestra/templates/_helpers.tpl#L80-L82

As far as I can tell, nothing else in the chart reads the replica settings.

Steps To Reproduce

No response

Environment Information

Kestra Version: 0.9.1
Helm Charts version: 0.5.1
Docker Image version: 0.9.1

tchiotludo commented 1 year ago

Hey @maxmeng-oss, just an open question on the subject. We ship a chart with Postgres so that people can quick-start Kestra on Kubernetes, but we expect people who need to custom-tune their Postgres instance to install a proper Postgres with another Helm chart and provide a simple connection string to Kestra. There are so many use cases that supporting them all could be a very long task and would not add much value.

What is your feeling about that? Do we need to handle it in our Helm charts?
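
For reference, here is a minimal sketch of what "another Postgres plus a connection string" could look like in the chart values: the bundled Postgres is disabled and Kestra is pointed at an external instance through a JDBC URL. The host name, database name, and credentials are placeholders, and the top-level `configuration` key is an assumption about how the chart forwards Kestra configuration; check the chart's values.yaml before relying on it.

```yaml
# Sketch only: disable the bundled Postgres and use an external instance.
postgresql:
  enabled: false

# Assumption: the chart passes this block to Kestra as its configuration.
configuration:
  kestra:
    repository:
      type: postgres
    queue:
      type: postgres
  datasources:
    postgres:
      # Placeholder host, database, and credentials.
      url: jdbc:postgresql://my-postgres.example.svc:5432/kestra
      driverClassName: org.postgresql.Driver
      username: kestra
      password: change-me
```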

ghost commented 1 year ago

Thanks for the reply, I understand. It seems the JDBC Postgres driver lacks read-only options, so we would have to put a load balancer in front to make it work.

tchiotludo commented 1 year ago

@maxmeng-oss JDBC and Java don't provide read/write dispatch for queries by default. We would have to implement it inside Kestra to make it available.

Can you share some numbers about your current instance, please? Number of flows? Number of executions and task executions per hour?

We haven't seen a bottleneck with a single read/write database so far, so maybe your figures are higher than anything we have seen before?

ghost commented 1 year ago

You're right that pg is not the bottleneck here.

It's a small cluster inside k8s (4 nodes) for research purposes. We are using it as a scheduler for async low-load jobs (2 flows, ~30 executions/h).

After setting random_page_cost in Postgres it looks normal now. https://github.com/kestra-io/kestra/issues/1121
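
A sketch of how a planner setting such as PostgreSQL's `random_page_cost` could be applied through the same `extendedConfiguration` key used in the values at the top of this issue. The value 1.1 is an assumption (a commonly suggested starting point for SSD-backed storage), not the value used in the linked issue.

```yaml
postgresql:
  enabled: true
  primary:
    extendedConfiguration: |
      # Assumed value; tune for your storage.
      random_page_cost = 1.1
```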