sassoftware / viya4-deployment

This project contains Ansible code that creates a baseline in an existing Kubernetes environment for use with the SAS Viya Platform, generates the manifest for an order, and then can also deploy that order into the Kubernetes environment specified.
Apache License 2.0

Pods are crashing #142

Closed venu-ibex-9 closed 3 years ago

venu-ibex-9 commented 3 years ago

I am getting the issue below; can anyone help me with this?

```
I0906 11:22:39.366127 1 scale_up.go:288] Pod sas-cas-control-86b98f77c5-58zs2 can't be scheduled on sas-test-eks-default20210904073839227900000022, predicate checking error: node(s) didn't match Pod's node affinity; predicateName=NodeAffinity; reasons: node(s) didn't match Pod's node affinity; debugInfo=
I0906 11:22:39.366160 1 scale_up.go:288] Pod sas-report-execution-cb79b77f5-wc5g6 can't be scheduled on sas-test-eks-default20210904073839227900000022, predicate checking error: node(s) didn't match Pod's node affinity; predicateName=NodeAffinity; reasons: node(s) didn't match Pod's node affinity; debugInfo=
I0906 11:22:39.366191 1 scale_up.go:288] Pod sas-analytics-services-648bcdd75c-vjjkh can't be scheduled on sas-test-eks-default20210904073839227900000022, predicate checking error: node(s) didn't match Pod's node affinity; predicateName=NodeAffinity; reasons: node(s) didn't match Pod's node affinity; debugInfo=
I0906 11:22:39.366221 1 scale_up.go:288] Pod sas-environment-manager-app-777c954694-677xh can't be scheduled on sas-test-eks-default20210904073839227900000022, predicate checking error: node(s) didn't match Pod's node affinity; predicateName=NodeAffinity; reasons: node(s) didn't match Pod's node affinity; debugInfo=
```
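For context, the "node(s) didn't match Pod's node affinity" predicate error means the pods carry a required node-affinity rule that no existing (or newly scaled-up) node satisfies. A minimal sketch of what such a rule looks like, assuming the `workload.sas.com/class` label key that viya4-deployment conventionally applies to node pools (the pod name and label value here are illustrative, not taken from the logs):

```yaml
# Hypothetical pod with a hard node-affinity requirement. If no node in the
# cluster carries the matching label, the scheduler reports exactly the
# NodeAffinity predicate failure shown in the log above.
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: workload.sas.com/class   # assumed label key
                operator: In
                values:
                  - stateless                  # assumed label value
  containers:
    - name: app
      image: nginx
```

Comparing the affinity stanza in the failing pods' specs against the labels actually present on your nodes is usually the fastest way to confirm this.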

I am also getting the issue below. I am assuming it is due to a low instance size; can you please confirm?

```
e group min size reached
I0906 11:22:39.372131 1 pre_filtering_processor.go:66] Skipping ip-192-168-97-161.ec2.internal - node group min size reached
I0906 11:22:39.372172 1 scale_down.go:423] Node ip-192-168-37-22.ec2.internal is not suitable for removal - memory utilization too big (0.896766)
I0906 11:22:39.373612 1 scale_down.go:423] Node ip-192-168-59-139.ec2.internal is not suitable for removal - memory utilization too big (0.908980)
I0906 11:22:39.373660 1 scale_down.go:423] Node ip-192-168-103-249.ec2.internal is not suitable for removal - memory utilization too big (0.932484)
I0906 11:22:39.373680 1 scale_down.go:423] Node ip-192-168-64-129.ec2.internal is not suitable for removal - memory utilization too big (0.999073)
I0906 11:22:39.373711 1 scale_down.go:423] Node ip-192-168-7-5.ec2.internal is not suitable for removal - memory utilization too big (0.949596)
I0906 11:22:39.373742 1 scale_down.go:423] Node ip-192-168-7-192.ec2.internal is not suitable for removal - memory utilization too big (0.931517)
I0906 11:22:39.373770 1 scale_down.go:423] Node ip-192-168-61-114.ec2.internal is not suitable for removal - memory utilization too big (0.925616)
I0906 11:22:39.373801 1 scale_down.go:423] Node ip-192-168-76-100.ec2.internal is not suitable for removal - memory utilization too big (0.954067)
I0906 11:22:39.373817 1 scale_down.go:488] Scale-down calculation: ignoring 2 nodes unremovable in the last 5m0s
I0906 11:22:39.373861 1 static_autoscaler.go:503] Scale down status: unneededOnly=false lastScaleUpTime=2021-09-04 17:54:54.722674816 +0000 UTC m=+21279.121544295 lastScaleDownDeleteTime=2021-09-04 12:00:18.617194618 +0000 UTC m=+3.016064074 lastScaleDownFailTime=2021-09-04 12:00:18.617194701 +0000 UTC m=+3.016064161 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I0906 11:22:39.373892 1 static_autoscaler.go:516] Starting scale down
I0906 11:22:39.373961 1 scale_down.go:868] No candidates for scale down
```

enderm commented 3 years ago

You are not giving much information here, and I also do not see a connection to this particular GitHub project. It looks like the deployment process itself was successful, so the errors you are seeing are post-deployment. This looks more like a question for SAS Tech Support.

Going forward, there are numerous guides out there on how to write good github issues, e.g. https://github.com/codeforamerica/howto/blob/master/Good-GitHub-Issues.md

When in doubt, use common sense:

I have often found that going through the steps of describing an issue so that others can understand it will actually lead me to a solution, or to obvious follow-up questions that I can investigate and that eventually lead to an answer.

That said, the error messages you are seeing point to a possible issue with the taints on your node pools, and also a possible issue with node pool sizing. Those are good areas to look into further.
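As a starting point for that investigation, you can dump node labels with `kubectl get nodes --show-labels` and taints with `kubectl describe node <name>`, then compare them against what the deployment expects. As a rough sketch, assuming the `workload.sas.com/class` taint/label scheme that viya4-deployment conventionally sets up for its node pools (the node name below is copied from the log purely for illustration), a correctly prepared CAS node would carry something like:

```yaml
# Hypothetical abridged Node object: the label satisfies the pods' node
# affinity, and the matching taint keeps unrelated workloads off the pool.
apiVersion: v1
kind: Node
metadata:
  name: ip-192-168-37-22.ec2.internal
  labels:
    workload.sas.com/class: cas   # assumed label key/value
spec:
  taints:
    - key: workload.sas.com/class # assumed taint key
      value: cas
      effect: NoSchedule
```

If the labels or taints are missing or misspelled on the nodes, the scheduler will refuse to place the SAS pods there, which matches the NodeAffinity errors in the first log.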

venu-ibex-9 commented 3 years ago

Sure, got it. Next time I will follow the GitHub guide linked above when raising an issue. We are reaching out to Tech Support.