Open kallisti5 opened 2 years ago
Scaling back to 1 replica... as expected, errors are seen:
AttachVolume.Attach failed for volume "pvc-8d8a12e9-a0f0-40fb-aff8-1c5121be403a" : rpc error: code = Aborted desc = volume pvc-8d8a12e9-a0f0-40fb-aff8-1c5121be403a is not ready for workloads
Longhorn wasn't even aware of the fault until I tried to use the detached volume.
Only after attempting to use the data on the "unreplicated detached volume" does Longhorn finally realize that the volume is faulted.
Notice that the other detached volume is also in a faulted condition, but Longhorn won't realize it until I attempt to use it.
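One way to inspect the real state without attaching the volume is to query the Longhorn CRs directly. A minimal sketch, assuming kubectl access to the longhorn-system namespace (though, per this issue, a detached volume's robustness may still read as healthy/unknown until it is attached):

```bash
# Volume CRs report state (attached/detached) and robustness
# (healthy/degraded/faulted); Replica CRs show which node each replica
# is stored on.
kubectl -n longhorn-system get volumes.longhorn.io
kubectl -n longhorn-system get replicas.longhorn.io
```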
Longhorn assumes the user does not touch the replicas directly. For a volume in the detached state, one possible solution is to check for the existence of the replicas periodically, but that still cannot detect changes made to the replica data by users. On the other hand, computing checksums periodically is not a good idea in either the running or the detached state: the volume-head or snapshots may be modified by users or applications while the volume is running, and the computation itself consumes compute and storage resources.
@derekbit even ignoring the replica-setting aspect: if you have a deployment scaled to zero (and the pods in that deployment are the only consumers), the still fully valid PVCs will slowly break as the Kubernetes nodes are recycled, since the volumes no longer replicate while in a detached state.
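A minimal sketch of the failure mode; the deployment name and label are placeholders:

```bash
# Scale the only consumer of a Longhorn PVC to zero; the volume detaches
# and its replicas are no longer rebuilt.
kubectl scale deployment my-app --replicas=0

# ...the provider or autoscaler recycles the nodes holding the replicas...

# Scale back up: the attach fails because no healthy replica remains.
kubectl scale deployment my-app --replicas=1
kubectl describe pod -l app=my-app   # shows the AttachVolume.Attach error above
```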
Honestly, this was the final nail in the coffin for our use of Longhorn: data loss is too easy, and backup restoration is too difficult.
This is a fair concern, because right now Longhorn does not replicate while the volume is in detached status. This is especially sensitive when running Longhorn on a managed K8s cluster.
Hi, I was wondering if there is an ETA for when this might be addressed? We'd like to use Longhorn with Rancher autoscaling, but autoscaling down will eventually result in data loss if detached volumes aren't replicated.
We will see if we can do something for 1.5; for now it has just been added to the backlog.
@jdbaudean consider using two scaling groups, i.e. a fixed storage set (can be scaled up but not down) and a dynamic worker set (can be scaled arbitrarily); see the sketch below.
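A rough sketch of the fixed storage pool, assuming Longhorn's `create-default-disk-labeled-nodes` setting is enabled so that only labeled nodes receive Longhorn disks (node names are placeholders):

```bash
# Only nodes in the fixed storage group get Longhorn disks; nodes in the
# autoscaled worker group stay unlabeled, so replicas never land on nodes
# that may be scaled down or recycled.
kubectl label node storage-node-1 node.longhorn.io/create-default-disk=true
kubectl label node storage-node-2 node.longhorn.io/create-default-disk=true
kubectl label node storage-node-3 node.longhorn.io/create-default-disk=true
```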
If your provider automatically recycles nodes after a time, you need to ensure that the Longhorn data disk is not located on the default node disk.
Instead, a dedicated disk needs to be attached to the recycling nodes; otherwise your data will be gone every time a node is recycled.
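For example, a sketch assuming the dedicated disk is already mounted at `/mnt/longhorn-disk` on each node (note that `default-data-path` only applies to nodes added after the setting is changed):

```bash
# Point Longhorn's default data path at the dedicated disk instead of the
# node's root filesystem, so recycling the root disk does not wipe the
# replica data.
kubectl -n longhorn-system patch settings.longhorn.io default-data-path \
  --type=merge -p '{"value": "/mnt/longhorn-disk"}'
```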
I've run into this because I've got a couple of degraded Longhorn volumes whose rebuilds keep failing.
I assume it's because there's too much load in the cluster (it's k3s on 5x RPi4 nodes, for experimentation). To reduce the load (particularly on the degraded volumes), I scaled the affected deployments down to zero replicas. But now the replication/rebuild doesn't run at all.
Hi, I have a blocking issue with this (volumes not being replicated while detached)!
If you try to drain a node for maintenance (a cluster upgrade) while a pod with a nodeSelector is pinned to that node, the volume gets detached and is no longer replicated, and the node can never be drained, because the PDB protects what is now the last replica.
What is the way to drain a node in that case? This is a real issue: if a volume is in the detached state, Longhorn should make sure all replicas are healthy before stopping the replication process!
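One workaround sketch, assuming a Longhorn version that has the `node-drain-policy` setting: `allow-if-replica-is-stopped` lets the drain proceed when the volume is detached and its replicas are stopped, instead of blocking on the PDB that protects the last replica. Weigh this carefully, since the stopped replica on the drained node may be the only copy of the data:

```bash
kubectl -n longhorn-system patch settings.longhorn.io node-drain-policy \
  --type=merge -p '{"value": "allow-if-replica-is-stopped"}'
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
```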
Any ETA on this? This is actually a very serious bug and makes Longhorn unsuitable for production!
I was just starting to set up a production cluster with Longhorn, and now I can't use it :disappointed:
I just lost a bunch of storage here because of this while upgrading my k8s version. Luckily these were just some production-like test environments and, of course, I have off-site backups, so no data was actually lost. But it can't be that we have to resort to DR measures because some service wasn't online during an infrastructure upgrade - and this will also happen without any upgrade if a service is down long enough for each node holding its existing replicas to be refreshed.
Is there any update or ETA on this? As the posters above state, it is a pretty serious bug.
This is about the feature "offline replica rebuilding".
We are tentatively planning for 1.8.
cc @derekbit
Thanks for keeping the work going!
Longhorn is a cool CSI, and it has massive potential in solving a very real-world k8s problem: RWX at smaller scales... however, this one bug was enough to make us go with other options.
We can extend the v2 offline replica rebuilding to v1 in v1.8. Please see the ticket https://github.com/longhorn/longhorn/issues/8443 and the enhancement proposal https://github.com/longhorn/longhorn/blob/master/enhancements/20230616-automatic-offline-replica-rebuild.md
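Based on that enhancement proposal, enabling it would look roughly like the sketch below; the exact setting name and accepted values may differ between Longhorn releases, so check the linked doc for your version:

```bash
# Globally enable offline replica rebuilding. Per the enhancement doc this
# can also be overridden per volume via spec.offlineReplicaRebuilding.
kubectl -n longhorn-system patch settings.longhorn.io offline-replica-rebuilding \
  --type=merge -p '{"value": "enabled"}'
```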
Describe the bug
If a volume enters a detached state, it no longer replicates.
To Reproduce
Steps to reproduce the behavior:
1. Create a Longhorn-backed PVC and a deployment that is its only consumer.
2. Scale the deployment to zero replicas so the volume detaches.
3. Recycle (replace) the Kubernetes nodes holding the volume's replicas.
4. Scale the deployment back up and observe the AttachVolume.Attach failure above.
Expected behavior
Longhorn should maintain replicas even when a volume is in a detached state; otherwise data loss is quietly guaranteed.
Side enhancement: Longhorn should prompt users about the status of various taint-related migrations.
Log or Support bundle
As shown in the screenshots: volume replicas were not maintained during a rolling recycle of nodes.
Environment