I recently installed sysdig on a test cluster. As it happens, it's the same cluster I run load tests on. While sysdig was running I started a load test. Initially the kangal controller timed out creating Kubernetes resources, so I increased the Kubernetes client timeout.
After that, the kangal controller was still unable to create all of the Kubernetes resources on the first pass, but it succeeded on the second attempt. The error and stack trace are included.
Stack trace
Feb 12 09:30:50.961 kangal-controller E0212 14:30:50.108353 1 loadtest.go:472] there is a conflict with loadtest 'loadtest-coiling-lightningbug' between datastore and cache. it might be because object has been removed or modified in the datastore
Feb 12 09:30:50.961 kangal-controller Created JMeter resources
Feb 12 09:30:40.866 kangal-controller Created pods with test data
Feb 12 09:30:10.769 kangal-controller Remote custom data enabled, creating PVC
Feb 12 09:29:55.762 kangal-controller E0212 14:29:54.895207 1 loadtest.go:309] error syncing 'loadtest-coiling-lightningbug': client rate limiter Wait returned an error: context deadline exceeded, requeuing
Feb 12 09:29:55.762 kangal-controller error syncing loadtest, re-queuing
Feb 12 09:29:55.762 kangal-controller Error on creating new JMeter service
Feb 12 09:29:55.762 kangal-controller Created pods with test data
Feb 12 09:29:15.659 kangal-controller Remote custom data enabled, creating PVC
Feb 12 09:29:00.590 kangal-controller Created new namespace
Work around
I uninstalled sysdig and the k8s API response time was much peppier. With sysdig removed, the kangal controller also succeeds on its first pass. I'm already in touch with sysdig support regarding the problem. Clearly they have some work to do. But maybe kangal does as well?
Solution?
I'm not really sure what the expectation of flow control is... Should this be the exclusive province of cluster admins? Should charts offer some guidance for their apps? Should kangal include a priority level configuration and flow schema for its service account?
What do folks think?