openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes that are provisioned from an optimized NVMe SPDK backend data storage stack.
Apache License 2.0

maya without etcd? #1527

Open aep opened 1 year ago

aep commented 1 year ago

mayastor uses etcd, which appears to be very fragile. is this mandatory? it doesn't appear to store much, so would any k/v store work?

i can reliably take down the mayastor cluster by testing power failure events. eventually you end up with:

panic: tocommit(423) is out of range [lastIndex(206)]. Was the raft log corrupted, truncated, or lost?

which appears to require manual intervention

see also https://github.com/openebs/mayastor/issues/1318#issue-1579430899

power failure is something that i'd prefer to recover from automatically. could something like cockroachdb work?

tiagolobocastro commented 1 year ago

> is this mandatory? it doesn't appear to store much, so any k/v would work?

We have an interface, pstor, which currently has only one implementation (etcd), so for the moment etcd is the only option. There's a backlog item for adding a pstor service proxy that every service would connect to; the proxy would in turn be the only component talking directly to etcd. This would make it easier to add options other than etcd.
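To illustrate the idea of a pluggable store behind a single proxy, here is a minimal Python sketch. It is not Mayastor's actual pstor API (Mayastor is written in Rust); the names `PersistentStore`, `InMemoryStore`, and `StoreProxy` are hypothetical, and the in-memory backend merely stands in for etcd or any other key/value store.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class PersistentStore(ABC):
    """Hypothetical key/value interface; NOT Mayastor's real pstor API."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryStore(PersistentStore):
    """Stand-in backend; an etcd or NATS backend would implement the same methods."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)


class StoreProxy:
    """Single entry point that services call instead of talking to a backend directly."""

    def __init__(self, backend: PersistentStore) -> None:
        self._backend = backend

    def put(self, key: str, value: bytes) -> None:
        self._backend.put(key, value)

    def get(self, key: str) -> Optional[bytes]:
        return self._backend.get(key)

    def delete(self, key: str) -> None:
        self._backend.delete(key)
```

With this shape, swapping the backend means changing only the object passed to `StoreProxy`, for example `StoreProxy(InMemoryStore())`; the services calling the proxy would not need to know which store sits behind it.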

> panic: tocommit(423) is out of range [lastIndex(206)]. Was the raft log corrupted, truncated, or lost?

A quick search suggests the etcd data was deleted. Are you using persistent storage for etcd?

inful commented 1 year ago

Would it not be possible to use the NATS built-in K/V capability (as it seems you already depend on NATS)?

aep commented 1 year ago

I would volunteer to contribute that! NATS is battle-tested in prod here.

Unfortunately I'm currently blocked by other issues with maya that I haven't had time to figure out yet.

tiagolobocastro commented 1 year ago

Interesting, I was not aware NATS had a built-in KV store! Does it have leases as well?
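For context on the question: in etcd, a lease is a time-limited grant that keys can be attached to; if the lease is not kept alive before its TTL runs out, the attached keys are removed. The toy Python sketch below shows that behaviour in simplified form, with the TTL attached per key rather than to a separate lease object. It is neither the etcd nor the NATS API, and `LeasedStore` and its methods are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LeasedStore:
    """Toy in-memory KV store with per-key TTL leases; illustrative only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data: Dict[str, Tuple[bytes, float]] = {}  # key -> (value, expiry time)

    def put_with_lease(self, key: str, value: bytes, ttl: float) -> None:
        """Store a value that expires ttl seconds from now unless renewed."""
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Optional[bytes]:
        """Return the value, or None if the key is missing or its lease expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if self._clock() >= expiry:
            del self._data[key]  # lazily drop expired entries on read
            return None
        return value

    def keep_alive(self, key: str, ttl: float) -> bool:
        """Extend the lease; returns False if it has already expired."""
        value = self.get(key)
        if value is None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True
```

A holder would call `put_with_lease` once and then `keep_alive` periodically; if it crashes or loses power, the key disappears after the TTL and another node can take over.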

inful commented 1 year ago

k3s is currently adding NATS as an alternative to etcd (see https://github.com/k3s-io/k3s/issues/7451). That work might serve as an inspiration :)

dm3ch commented 1 year ago

By the way, can't the k8s API be used as the KV storage, to avoid spinning up a separate KV store?

tiagolobocastro commented 1 year ago

Not at the moment, we interact with etcd directly, which can't be done with k8s etcd.

dm3ch commented 1 year ago

Am I right to understand that it would be possible if a pstor driver were created for it?

I'm asking because I'm interested in why k8s isn't used as the KV storage, given that mayastor always runs in k8s, where there's already a working k8s API. :)

tiagolobocastro commented 1 year ago

Mayastor is actually not tied to k8s, which is why it cannot rely only on the k8s API. Today each service may talk to etcd directly, so it would be rather messy to change each service to know about different storage types.

aep commented 1 year ago

i'm hoping to be able to share a solution with nats soon, that is directly usable for mayastor. ~1 week maybe.

in the meantime this might help with regard to leases: https://github.com/nats-io/nats-server/discussions/4803#discussioncomment-7612323

michaelbeaumont commented 2 months ago

> Mayastor is actually not tied to k8s, which is why it cannot only rely on k8s api;

Right, but couldn't the k8s API just be an alternative backend to pstor?

tiagolobocastro commented 1 month ago

Yes, it could; it'd be one of the possible pstor implementations.