dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

Postgres connection pooling not working #262

Open dhiaayachi opened 2 weeks ago

dhiaayachi commented 2 weeks ago

Expected Behavior

New connections should rarely be opened; the pool should reuse existing ones.

Actual Behavior

About 50% of total CPU load (4 cores) comes just from opening new connections.


Steps to Reproduce the Problem

  1. Run a cluster with 2k shards (I assume that's the culprit).
  2. Use the Postgres v97 plugin (the actual Postgres server is v14; we still have to upgrade the plugin).

Specifications

dhiaayachi commented 6 days ago

Thank you for reporting this issue! I see that you're experiencing high CPU load due to connection opening.

Based on your description, it seems like the issue might be related to the high number of shards (2k) you're using in your cluster. To confirm this, could you please share the following information:

The information you provide will help me diagnose the issue and provide more tailored suggestions.

In the meantime, I recommend you review the documentation on Worker performance: https://docs.temporal.io/develop/worker-performance

This document provides insights on how to monitor and tune your Workers to optimize performance and manage resource consumption.

dhiaayachi commented 6 days ago

Thank you for reporting this issue!

It seems like you are experiencing high CPU load due to connection opening in your Temporal cluster.

Could you provide more information about the steps you are taking to reproduce the issue? What is the actual code that you are running?

Also, could you confirm if you are using Temporal Cloud or a self-hosted Temporal cluster? This will help me understand the context and provide the best solution.

Let me know if you have any other questions.

dhiaayachi commented 6 days ago

Thank you for reporting this issue. Based on the information provided, it seems that you are experiencing high CPU utilization when opening new connections. This could be caused by the large number of shards (2k) in your cluster.

A large number of shards can lead to a significant increase in the number of connections that Temporal needs to open to interact with the database.

To troubleshoot this, please consider:

If the problem persists, please provide more information about your environment, specifically:

This will help me better understand the situation and find a solution for you.

dhiaayachi commented 6 days ago

Thank you for reporting the issue.

You're observing very high CPU load when opening new connections, which is likely due to the high number of shards (2k).

Possible Solutions:

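  1. Reduce the number of shards: Fewer shards should mean fewer concurrent database connections and less load on your cluster. Note that the history shard count is fixed when a Temporal cluster is created, so this means provisioning a new cluster rather than changing a setting.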

  2. Upgrade the Postgres plugin: The version mismatch between your actual Postgres (v14) and the plugin (v97) could be a contributing factor. Upgrading the plugin may resolve the issue.

  3. Increase cluster resources: To mitigate the load, increase the cluster resources, particularly CPU and memory.

  4. Optimize Postgres performance: Review the Postgres documentation for performance optimization tips. https://www.postgresql.org/docs/

  5. Contact Support: For further assistance, please contact the Temporal Support team. https://temporal.io/cloud/support

Let me know if you have any other questions.