Closed edisonwang closed 5 years ago
Hi @edisonwang , looks like the failure is here in the controller api code:
    def _check_for_failed_events(self, namespace, labels):
        """
        Request for new ReplicaSet of Deployment and search for failed events involved by that RS
        Raises: KubeException when RS have events with FailedCreate reason
        """
        response = self.rs.get(namespace, labels=labels)
        data = response.json()
        fields = {
            'involvedObject.kind': 'ReplicaSet',
            'involvedObject.name': data['items'][0]['metadata']['name'],
            'involvedObject.namespace': namespace,
            'involvedObject.uid': data['items'][0]['metadata']['uid'],
        }
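For context, a dictionary like `fields` is the kind of filter Kubernetes accepts as a field selector, a comma-separated list of `key=value` pairs. The sketch below is not the controller's actual code. The helper names `build_field_selector` and `replicaset_event_fields` are hypothetical, and the empty-list guard is an assumption added for illustration, since `data['items'][0]` raises `IndexError` in Python when no ReplicaSet matches the labels.

```python
def build_field_selector(fields):
    """Join a dict of field filters into 'key=value,key=value' form.

    Keys are sorted only to make the output deterministic.
    """
    return ','.join(f'{key}={value}' for key, value in sorted(fields.items()))


def replicaset_event_fields(data, namespace):
    """Build event filters for the first ReplicaSet in a list response.

    Hypothetical helper: returns None when the list is empty, because
    data['items'][0] would otherwise raise IndexError.
    """
    items = data.get('items') or []
    if not items:
        return None
    metadata = items[0]['metadata']
    return {
        'involvedObject.kind': 'ReplicaSet',
        'involvedObject.name': metadata['name'],
        'involvedObject.namespace': namespace,
        'involvedObject.uid': metadata['uid'],
    }
```

Calling `build_field_selector(replicaset_event_fields(data, namespace))` on a non-empty response gives a single string that could be passed as a field selector when listing events.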
I noticed the docstring at the top says "Raises: KubeException when RS have events with FailedCreate reason". Do your ReplicaSets have events with a FailedCreate reason? That could be why your deployments are failing with the above exceptions.
@edisonwang Can you check your ReplicaSets for any of the failures that could be happening? Here is a list of possible failures:
https://kukulinski.com/10-most-common-reasons-kubernetes-deployments-fail-part-2/
Could it be number (6) Resource Quotas or (7) Insufficient Cluster Resources?
The best way to check is to run kubectl describe rs ... and look at the events on the ReplicaSets while the errors are happening.
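If reading the describe output by hand gets tedious, the same check can be scripted. This is a minimal sketch, assuming the events have been exported as JSON (for example with `kubectl get events -n <namespace> -o json`) and parsed into a dict. The function name `failed_create_events` and the sample payload are hypothetical; only the `reason`, `message`, and `involvedObject` fields of Kubernetes Event objects are relied on.

```python
def failed_create_events(event_list, kind='ReplicaSet'):
    """Return (object name, message) pairs for FailedCreate events.

    event_list is a parsed Kubernetes event list: a dict with an
    'items' list of Event objects.
    """
    failures = []
    for event in event_list.get('items', []):
        involved = event.get('involvedObject', {})
        if event.get('reason') == 'FailedCreate' and involved.get('kind') == kind:
            failures.append((involved.get('name'), event.get('message', '')))
    return failures


# Hypothetical sample payload, for illustration only.
sample = {
    'items': [
        {
            'reason': 'FailedCreate',
            'message': 'exceeded quota',
            'involvedObject': {'kind': 'ReplicaSet', 'name': 'web-123'},
        },
        {
            'reason': 'SuccessfulCreate',
            'message': 'Created pod',
            'involvedObject': {'kind': 'ReplicaSet', 'name': 'web-123'},
        },
    ]
}

for name, message in failed_create_events(sample):
    print(f'{name}: {message}')
```

An empty result means no FailedCreate events were recorded for ReplicaSets in that export, which would point away from quotas or insufficient resources.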
Hi, thanks for the answer. The problem went away after a restart, but I then reconfigured Hephy to use an S3 backend and lost the database, so I ended up reinstalling the whole thing. I also suspect it's a cluster failure or events-related, not necessarily a Deis issue, but the error message is rather confusing. I'll keep an eye out for when it happens again and get back with more logs.
Alright, sounds good! From the exception that is thrown, it looks like it could be resource limits or some other event on the ReplicaSet. I'm going to close this issue for now; feel free to reopen it if it recurs.
I've had this issue since yesterday and couldn't find the cause; logs are attached below. I use self-hosted GitLab CI to automatically build and deploy to a Deis cluster, and everything worked fine until this happened. It now happens across all my Deis apps (I tried 3, and all have the same problem). I tried removing this app and creating a new one (the log below is from the freshly created app). As the log shows, it sometimes succeeds, but most of the time it just errors out.
Error:
Controller log: