longhorn / longhorn

Cloud-Native distributed storage built on and for Kubernetes
https://longhorn.io
Apache License 2.0

[BUG] volume attaching forever #2845

Closed liyimeng closed 3 years ago

liyimeng commented 3 years ago

Describe the bug

I have had Longhorn running in my home lab for some weeks. Suddenly my volumes failed to attach. Here is one example:

pvc-48cf2ba0-77c1-4a7b-a52d-d2ec877187c2-e-b45b8d04 | Engine | longhorn-engine-controller | Warning | 9 hours ago | 4 minutes ago | FailedStarting | Error starting pvc-48cf2ba0-77c1-4a7b-a52d-d2ec877187c2-e-b45b8d04: failed to start process: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.42.2.200:8500: connect: cannot assign requested address"

I have no clue what to look into :(

To Reproduce

Don't know.

joshimoo commented 3 years ago

ref #2778 #2818

jenting commented 3 years ago

ref #2778 #2818

From the log, this is related to those two issues. We'll have a release to fix this soon.

The current workaround is to restart the longhorn-manager Pods:

    kubectl rollout restart ds/longhorn-manager -n longhorn-system
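For anyone scripting this workaround, here is a minimal sketch that restarts the longhorn-manager DaemonSet and then waits for the rollout to finish. The function name `restart_longhorn_manager`, the overridable `NAMESPACE`, `DAEMONSET`, and `KUBECTL` variables, and the 300-second timeout are assumptions made for illustration and are not from the thread. Only the `rollout restart` command itself comes from the comment above.

```shell
#!/bin/sh
# Sketch only: the variable names and the timeout are illustrative assumptions.
NAMESPACE="${NAMESPACE:-longhorn-system}"
DAEMONSET="${DAEMONSET:-longhorn-manager}"
KUBECTL="${KUBECTL:-kubectl}"

# Restart the longhorn-manager Pods, then block until the DaemonSet
# reports that every Pod has been replaced (or the timeout expires).
restart_longhorn_manager() {
  "$KUBECTL" rollout restart "ds/${DAEMONSET}" -n "$NAMESPACE" &&
    "$KUBECTL" rollout status "ds/${DAEMONSET}" -n "$NAMESPACE" --timeout=300s
}

# Usage: source this file, then run
#   restart_longhorn_manager
```

Waiting on `rollout status` makes it easier to tell when the managers are back before retrying the volume attach. Whether the attach then succeeds still depends on the underlying issues referenced above.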

innobead commented 3 years ago

Duplicate of #2778 and #2818. Closing.