Closed runningman84 closed 2 years ago
This issue has been mentioned on Sensu Community. There might be relevant details there:
https://discourse.sensu.io/t/sensu-classic-vs-sensu-go-scalability/2438/1
It seems etcd was overwhelmed by the load placed on the cluster, which put the cluster into an unrecoverable crash loop. The Postgres store is a feature designed to address the scaling issues around etcd. Perhaps spin up a Postgres instance and try to reproduce the issue again?
Does the OSS version support the Postgres datastore at all? The etcd cluster running in k8s did not throw any errors.
I do not think that the sensu-go free version with the 100 entity limit would suffer from any problem...
An OSS build does not contain the Postgres store.
We tried scaling etcd to 7 nodes (using the embedded etcd install) and this issue still persisted with 12k events / 600 entities.
We recently found and fixed an issue where an overloaded Sensu would deadlock on its way to crashing. That issue has now been patched, so overloaded Sensu instances should crash cleanly instead of deadlocking.
For context, here is the paper trail: https://github.com/sensu/sensu-go/issues/4461
Expected Behavior
The sensu backend should run fine for days...
Current Behavior
After a few days of operation, the sensu backend no longer exposes the API on port 8080. The logs look like this: https://pastebin.com/FjQ6Psc2
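For anyone trying to catch the moment the API stops listening, a minimal sketch of a TCP probe might help. It only checks whether something accepts connections on the port and makes no assumptions about Sensu's HTTP endpoints. The function name `port_is_open` and the host `localhost` are illustrative assumptions; 8080 is the port mentioned above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "localhost" is an assumed host; 8080 is the API port from the report.
    print(port_is_open("localhost", 8080))
```

Running this from cron or a loop and logging the timestamp of the first `False` would narrow down when the backend stopped serving, which can then be matched against the backend logs.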
Possible Solution
Steps to Reproduce (for bugs)
Context
We have an existing Sensu Classic environment with a lot of clients, and we are trying to migrate them to Sensu Go OSS.
Your Environment