Closed by kkaempf 2 weeks ago
Waiting for customer feedback.
This is still waiting for customer feedback; a solution was provided, so we could consider closing it.
I have asked the customer to provide the output of the following commands (a sketch for collecting everything in one pass follows the list):
kubectl get clusters.management.cattle.io -A
kubectl get clusters.provisioning.cattle.io -A
kubectl get eksclusterconfigs.eks.cattle.io -A
kubectl get clusters.management.cattle.io -A -o yaml
kubectl get clusters.provisioning.cattle.io -A -o yaml
kubectl get eksclusterconfigs.eks.cattle.io -A -o yaml
kubectl logs -n cattle-system eks-operator-ID
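For convenience, here is a minimal sketch of how that output could be collected in one pass. It was not run on the customer's side: the output directory name is arbitrary, and the operator pod is resolved by name prefix because the actual pod name (eks-operator-ID) carries a generated suffix.

```bash
#!/usr/bin/env bash
# Collect the requested diagnostics into a timestamped local directory.
# Assumes kubectl is pointed at the Rancher management (local) cluster.
set -euo pipefail

outdir="eks-diag-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$outdir"

for crd in clusters.management.cattle.io \
           clusters.provisioning.cattle.io \
           eksclusterconfigs.eks.cattle.io; do
  kubectl get "$crd" -A         > "$outdir/$crd.txt"
  kubectl get "$crd" -A -o yaml > "$outdir/$crd.yaml"
done

# The eks-operator pod name has a generated suffix, so resolve it by
# prefix instead of hard-coding the ID.
pod=$(kubectl get pods -n cattle-system -o name | grep eks-operator | head -n 1)
kubectl logs -n cattle-system "$pod" > "$outdir/eks-operator.log"
```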
We have received confirmation that this can be closed.
SURE-8366
Issue description:
The customer reports that the eks-operator is constantly sending DeleteCluster calls to the AWS API for clusters that have already been deleted (from Rancher). Restarting Rancher does not help; the calls repeat every 150 seconds.
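As a side note, the cadence and origin of those calls can also be confirmed from the AWS side. The sketch below is our own suggestion rather than part of the customer's troubleshooting, and assumes an AWS CLI profile for the affected account and region with permission to read CloudTrail:

```bash
# Look up recent DeleteCluster events in CloudTrail to confirm how often
# the calls arrive and which principal is issuing them.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DeleteCluster \
  --max-results 20 \
  --query 'Events[].{Time:EventTime,User:Username}' \
  --output table
```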
Business impact:
This isn't causing workload outages, but it's more of an annoyance for them.
Troubleshooting steps:
We held a few calls to trace where the requests were coming from. Some of the deleted clusters were still referenced in the eksclusterconfig (ekscc) objects, although in the more recent logs the original clusters the customer was concerned about no longer appeared.
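For reference, a quick way to spot leftover objects of that kind is to list the eksclusterconfigs together with their deletion timestamps and finalizers; an object with a deletionTimestamp set but finalizers still present would keep the operator reconciling and re-issuing delete calls. This is a sketch of the idea rather than output from the customer's cluster:

```bash
# Show eksclusterconfigs that may be stuck deleting: any row with a
# DELETION timestamp set but FINALIZERS still listed is a likely source
# of repeated delete attempts.
kubectl get eksclusterconfigs.eks.cattle.io -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,DELETION:.metadata.deletionTimestamp,FINALIZERS:.metadata.finalizers'
```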
Actual behavior:
The cluster in question was removed from both Rancher and EKS. However, Rancher continues to send requests to delete it.
Expected behavior:
When a cluster is removed from Rancher, the deletion should complete, and Rancher should stop sending DeleteCluster requests to AWS once the cluster no longer exists.
Files, logs, traces:
(See JIRA)
Additional notes:
It is important to note that the customer has some unusual permission restrictions on the AWS side, applied "for security reasons" that we could not get further details about. That is why we were seeing those errors in the AWS logs.
See SURE-8366 for the rest of the logs & the impacted cluster list