Open jonasdkhansen opened 5 years ago
Connection reset by peer is pretty benign; it just means that the remote connection was closed. If you're not seeing any issues in your application, it should be fine to ignore.
Connection refused is a little more concerning. If you're not using readiness probes, it suggests that a new pod started up and began receiving traffic before it was ready. Otherwise, the endpoint discovery data from your api-server may be stale. I've not seen any issues around that with GKE though ...
If you're particularly concerned, inject your workloads with --enable-debug-sidecar and watch the logs. The debug sidecar runs tshark and provides some added insight into what's happening.
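To make the "receiving traffic before it was ready" case concrete, here is a minimal sketch of the kind of TCP check a readiness probe performs. The function name port_is_ready is hypothetical and not part of Linkerd or Kubernetes; the address 127.0.0.1:8080 in the usage note is taken from the log line in the report. When nothing is listening on the port yet, the connect attempt fails with Connection refused (os error 111 on Linux), which is what the proxy logs.

```python
import socket


def port_is_ready(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration; it only mimics a TCP-style
    readiness check and is not how Linkerd itself decides readiness.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (nothing listening yet) and timeouts.
        return False
```

Calling port_is_ready("127.0.0.1", 8080) from inside the pod while the application is still starting would be expected to return False until the application begins listening.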
We need to add this information to the docs somewhere
Our Kubernetes cluster also gets the same error: Connection reset by peer (os error 104)
Linkerd Grafana shows the success rate is sometimes down a few percent.
Is there a way to fix the error, or just ignore it?
Thanks for the message @xiaowheat
Have a look at the logs of the service that Linkerd is proxying connections for. If you see errors there, then we can explore those. If there are no errors or unexpected behavior from the service, then you can probably ignore the Connection reset by peer errors.
Bug Report
What is the issue?
I see a lot of connection reset by peer errors in the proxy log on all my services. I'm kind of stuck finding the cause, so I hope someone can point me in the right direction. It doesn't seem to affect the traffic, but the success rate in Linkerd is sometimes down a few percent.
How can it be reproduced?
No, I am not able to reproduce the issue by sending a lot of requests to the service. The problem only shows up in the logs.
Logs, error output, etc
ERR! [ 11082.146455s] proxy={server=in listen=0.0.0.0:4143 remote=bla.bla.bla:35314} linkerd2_proxy::app::errors unexpected error: connection error: Connection reset by peer (os error 104)
And this one:
ERR! [ 87.959649s] proxy={server=in listen=0.0.0.0:4143 remote=bla.bla.bla:50416} linkerd2_proxy::app::errors unexpected error: error trying to connect: Connection refused (os error 111) (address: 127.0.0.1:8080)
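Since the two errors call for different responses, it can help to count them separately before deciding what to investigate. The following is a rough sketch that tallies proxy log lines by OS error code. It assumes the log lines end with a "(os error N)" suffix as in the two samples above; the function name tally_os_errors and the regular expression are illustrative, not part of any Linkerd tooling.

```python
import re
from collections import Counter

# Captures the message and numeric code from "<message> (os error <code>)".
OS_ERROR_RE = re.compile(r"([A-Za-z ]+) \(os error (\d+)\)")


def tally_os_errors(lines):
    """Count log lines per (os error code, message) pair."""
    counts = Counter()
    for line in lines:
        match = OS_ERROR_RE.search(line)
        if match:
            message = match.group(1).strip()
            code = int(match.group(2))
            counts[(code, message)] += 1
    return counts
```

Feeding it the output of the proxy container's logs would show whether the noise is dominated by resets (104) or by refusals (111), the latter being the case worth following up.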
linkerd check output
kubernetes-api
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-config
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist
linkerd-existence
√ 'linkerd-config' config map exists
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API
linkerd-api
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles
linkerd-version
√ can determine the latest version
‼ cli is up-to-date
    is running version 2.4.0 but the latest stable version is 2.5.0
    see https://linkerd.io/checks/#l5d-version-cli for hints
control-plane-version
‼ control plane is up-to-date
    is running version 2.4.0 but the latest stable version is 2.5.0
    see https://linkerd.io/checks/#l5d-version-control for hints
√ control plane and cli versions match
Status check results are √
Environment