openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes that are provisioned from an optimized NVMe SPDK backend data storage stack.
Apache License 2.0

Recover pool #1199

Closed: sgerme closed this issue 2 years ago

sgerme commented 2 years ago

Is your feature request related to a problem? Please describe.

Hi, I have a cluster running Talos 0.14 on bare metal using NVMe SSDs. After one of the nodes crashed, I redeployed it with the same name, but the pool is no longer recognized.

Describe the solution you'd like

A way to boot a regular Linux system and recover all the data. When I try to mount the disk, it cannot be mounted because it does not appear to be in a valid format.

Describe alternatives you've considered

I tried to recreate the pool, but it appears with status "unknown".

Additional context

Two bare-metal servers with two NVMe SSDs: one for the Talos system, the other dedicated to Mayastor. I am trying to recover data from this second drive.

Thanks for helping!

tiagolobocastro commented 2 years ago

What version of mayastor are you running? I suspect you've hit a pool operator bug, which is causing the pool to simply appear as "unknown". Could you please run kubectl-mayastor get pools and check if the pool is also "unknown" there?

sgerme commented 2 years ago

I am running on ARM and have never managed to run the mayastor binaries :( I'll try on another machine.


zsh: exec format error: kubectl-mayastor

tiagolobocastro commented 2 years ago

Oh, we don't have a published arm version. You can try building it yourself from the control-plane repo from the 1 release branch, if that's what you're running:

nix-build -A utils.release.linux-musl.kubectl-plugin
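For readers hitting the same error: an "exec format error" usually means the binary was built for a different CPU architecture than the host. The following is a minimal sketch for checking that, not something from the thread. The `arch_family` helper and the `./kubectl-mayastor` path are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -m` style names to a coarse family,
# so that e.g. "aarch64" and "arm64" compare as equal.
arch_family() {
    case "$1" in
        x86_64|amd64) echo "x86_64" ;;
        aarch64|arm64) echo "arm64" ;;
        *) echo "$1" ;;
    esac
}

# Architecture of the machine running this script.
host="$(arch_family "$(uname -m)")"
echo "host architecture: $host"

# Assumed location of the downloaded plugin binary.
BIN="./kubectl-mayastor"

# If the binary is present and `file` is installed, print what the binary
# was built for; compare that with the host architecture printed above.
if [ -f "$BIN" ] && command -v file >/dev/null 2>&1; then
    file "$BIN"
fi
```

If the two architectures differ, the binary cannot run on that host, and building from source as suggested above is the way forward.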

tiagolobocastro commented 2 years ago

@sgerme, to recreate the pool I think you can delete the CR and recreate it. If the delete gets stuck you may remove the finalizer. This is safe: we don't delete any replicas on CR delete, only the pool, and only if it has no replicas known to the control-plane.
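The delete-and-recreate sequence described above might look roughly like the sketch below. It is not taken from the thread. It assumes the pool CR is of kind `mayastorpool` in the `mayastor` namespace, that the pool is named `pool-on-node-1`, and that its manifest was saved as `pool.yaml`; adjust all of these to your install. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. Assumptions (not stated in the thread): the pool CR is of
# kind "mayastorpool" in namespace "mayastor", is named "pool-on-node-1",
# and its manifest was saved beforehand as "pool.yaml".
NAMESPACE="mayastor"
POOL="pool-on-node-1"
MANIFEST="pool.yaml"

# Print each command by default; export DRY_RUN=0 to actually run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Request deletion without blocking, so a stuck finalizer does not hang the shell.
delete_pool() {
    run kubectl delete mayastorpool "$POOL" -n "$NAMESPACE" --wait=false
}

# Only needed if the delete stays stuck: clear the finalizers on the CR.
remove_finalizer() {
    run kubectl patch mayastorpool "$POOL" -n "$NAMESPACE" \
        --type merge -p '{"metadata":{"finalizers":null}}'
}

# Recreate the pool from the saved manifest.
recreate_pool() {
    run kubectl apply -f "$MANIFEST"
}

# Dry-run walk-through of the three steps. In a real run, skip
# remove_finalizer if the delete already completed.
delete_pool
remove_finalizer
recreate_pool
```

Keeping the commands behind a dry-run switch lets you review exactly what would be sent to the cluster before touching the CR.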

sgerme commented 2 years ago

In the end I deleted everything and recreated the pools. I've downloaded backups. Thanks anyway :)