Open WJay-tec opened 1 year ago
@WJay-tec - Thanks for the detailed steps to reproduce.
If I understand correctly, your service is ClusterIP?
Can you check the ips in the ServiceImport object and see whether they get updated after deleting the pod?
spec:
  ips:
  - x.x.x.x
  type: "ClusterSetIP"
status:
  clusters:
  - cluster: stag-eks-2
Yes, it is ClusterIP, and yes, I confirm that the ips in the ServiceImport object are updated after deleting the pod.
I can also confirm that once I manually delete the ServiceImport after that, my curl command successfully gets a response.
Probably some sort of race condition happening here? @runakash
Problem
Calling the Clusterset service endpoint after deleting a pod for that service results in a connection refused error.
Steps to reproduce connection refused error
Steps to resolve the issue
Based on my current observation, it seems like CoreDNS is not getting the latest pod IP and is still resolving to the old pod IP. When the ServiceImport is recreated, it starts working again, probably because the CoreDNS record is updated by the recreation.
It is also worth adding that removing the readinessProbe from the deployment manifest fixes the issue mentioned above (I don't really understand how that fixes it).
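One way to test the stale-DNS theory above is to compare what the clusterset name resolves to against the ips listed in the ServiceImport. The following is only a minimal sketch, not part of any controller or tool: the helper names (spec_ips, stale_addresses, resolve) and the sample addresses are hypothetical. It assumes the ServiceImport has already been loaded into a dict (for example from the YAML output of kubectl) and that resolve is run from inside a pod, so that lookups go through CoreDNS.

```python
import socket


def spec_ips(service_import: dict) -> set[str]:
    """Return the addresses listed under spec.ips of a ServiceImport manifest."""
    return set(service_import.get("spec", {}).get("ips") or [])


def stale_addresses(service_import: dict, resolved: list[str]) -> list[str]:
    """Return resolved addresses that are NOT listed in spec.ips, sorted."""
    return sorted(set(resolved) - spec_ips(service_import))


def resolve(hostname: str, port: int = 80) -> list[str]:
    """Resolve hostname with the local resolver (CoreDNS when run in a pod)."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Hypothetical sample data: the ServiceImport lists the new address,
# while DNS (simulated here, no lookup is performed) still returns the old one.
service_import = {"spec": {"ips": ["10.0.0.12"], "type": "ClusterSetIP"}}
dns_answer = ["10.0.0.7"]
print(stale_addresses(service_import, dns_answer))  # ['10.0.0.7']
```

Inside the cluster, dns_answer would instead come from resolve("<service>.<namespace>.svc.clusterset.local"), with the placeholders filled in for the affected service. A non-empty result right after deleting the pod would support the idea that the DNS record lags behind the ServiceImport; an empty result would point elsewhere.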