Description
There is a period during which all REST APIs have high response times (in seconds).
CPU load across the three Citus worker nodes is very uneven: two workers sit at about 2.5 CPU cores while shard2 is at 7.6, hitting its resource limits. There are also very frequent database connection establish/teardown log entries for the mirror_rest user, and most of those sessions appear to be short-lived and suspicious.
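Not yet verified against this deployment, but two quick checks from a coordinator might help narrow this down: the count and age of mirror_rest backends in pg_stat_activity (to confirm the churn) and the shard distribution per worker (to explain the uneven CPU). A minimal sketch, assuming Python with psycopg2, a placeholder DSN, and Citus >= 10 for the citus_shards view:

```python
# Sketch only: the DSN below is a placeholder and must be adapted to the
# actual coordinator host, database, and credentials.
import psycopg2

COORDINATOR_DSN = "host=coordinator-0 port=5432 dbname=mirror_node user=postgres"

# Count mirror_rest sessions and see how old they are; consistently young
# backend_start values across repeated runs would confirm the churn seen
# in the connection logs.
SESSION_QUERY = """
    SELECT count(*)                   AS sessions,
           min(now() - backend_start) AS youngest,
           max(now() - backend_start) AS oldest
    FROM pg_stat_activity
    WHERE usename = 'mirror_rest'
"""

# Shard count and total size per worker (assumes the citus_shards view,
# available in Citus 10+) to check whether the hot worker simply holds
# more or larger shards.
SHARD_QUERY = """
    SELECT nodename,
           count(*)                        AS shard_count,
           pg_size_pretty(sum(shard_size)) AS total_size
    FROM citus_shards
    GROUP BY nodename
    ORDER BY nodename
"""


def main() -> None:
    with psycopg2.connect(COORDINATOR_DSN) as conn, conn.cursor() as cur:
        cur.execute(SESSION_QUERY)
        print("mirror_rest sessions:", cur.fetchone())
        cur.execute(SHARD_QUERY)
        for row in cur.fetchall():
            print("worker:", row)


if __name__ == "__main__":
    main()
```

Running this periodically on both coordinators should show whether the server-side sessions really are short-lived and whether shard2's worker simply holds more data than the other two.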
The number of mirror_rest user connections reported in pg_stat_activity on the two coordinator nodes is low, around 15 and 19 respectively. There are also a lot of error logs like the following in pgbouncer:
pgbouncer stats may give us more insight; however, they are hard to get.
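For reference, one way to pull those stats is to query pgbouncer's admin console directly. This is only a sketch: host, port, and credentials are placeholders, and it assumes the connecting user is listed in pgbouncer's stats_users (or admin_users):

```python
# Sketch only: connection parameters are placeholders for this deployment.
import psycopg2


def dump_pgbouncer_stats(host: str = "pgbouncer", port: int = 6432,
                         user: str = "pgbouncer", password: str = "secret") -> None:
    # The admin console is exposed as the virtual "pgbouncer" database.
    conn = psycopg2.connect(host=host, port=port, dbname="pgbouncer",
                            user=user, password=password)
    # The admin console does not accept transaction control statements,
    # so autocommit is required for psycopg2 to issue SHOW commands.
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            for command in ("SHOW STATS", "SHOW POOLS"):
                cur.execute(command)
                columns = [col.name for col in cur.description]
                print(command, columns)
                for row in cur.fetchall():
                    print(row)
    finally:
        conn.close()


if __name__ == "__main__":
    dump_pgbouncer_stats()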
In the end, restarting the pgbouncer container seems to fix the issue, though the root cause is still unknown.
Grafana dashboard
Steps to reproduce
Check the description
Additional context
No response
Hedera network
other
Version
v0.110.0-SNAPSHOT
Operating system
None