openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes that is provisioned from an optimized NVME SPDK backend data storage stack.
Apache License 2.0

Not working on arm64 architecture #1568

Open phiilu opened 11 months ago

phiilu commented 11 months ago

Describe the bug

I wanted to install Mayastor via Helm on my arm64 Talos 1.6.1 server, but I can't get it working. The etcd deployment appears to depend on the image docker.io/bitnami/bitnami-shell:11-debian-11-r63, which is not built for arm64, so the volume-permissions container fails to start.

 ➜  ~ kubectl describe pods mayastor-etcd-0
(...)
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  16m                  default-scheduler  Successfully assigned mayastor/mayastor-etcd-0 to n2-storage
  Normal   Pulled     15m (x5 over 16m)    kubelet            Container image "docker.io/bitnami/bitnami-shell:11-debian-11-r63" already present on machine
  Normal   Created    15m (x5 over 16m)    kubelet            Created container volume-permissions
  Normal   Started    15m (x5 over 16m)    kubelet            Started container volume-permissions
  Warning  BackOff    103s (x70 over 16m)  kubelet            Back-off restarting failed container volume-permissions in pod mayastor-etcd-0_mayastor(52798499-a0d7-42ef-acdd-5519341ed07f)
 ➜  ~ kubectl logs pods/mayastor-etcd-0 -c volume-permissions
exec /bin/bash: exec format error
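
For anyone hitting the same message: `exec format error` generally means the binary in the image was built for a different CPU architecture than the node it runs on. Below is a minimal sketch for confirming that from a workstation, assuming the `docker` CLI is installed and can reach the registry. The helper names `to_oci_arch` and `manifest_has_arch` are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether an image was published for a given architecture.

# Map `uname -m` style machine names to the names used in image manifests.
to_oci_arch() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *)             echo "$1" ;;
  esac
}

# Succeed if the manifest JSON on stdin lists the given architecture.
manifest_has_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage (not run here; needs registry access):
#   docker manifest inspect --verbose docker.io/bitnami/bitnami-shell:11-debian-11-r63 \
#     | manifest_has_arch "$(to_oci_arch aarch64)" || echo "no arm64 build published"
```

The `--verbose` flag is used so that platform information is printed even when the tag points at a single-architecture image rather than a manifest list.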

To Reproduce

 helm repo add mayastor https://openebs.github.io/mayastor-extensions/
 ➜  ~ helm search repo mayastor --versions
NAME                CHART VERSION   APP VERSION DESCRIPTION
mayastor/mayastor   2.5.0           2.5.0       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.4.0           2.4.0       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.3.0           2.3.0       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.2.0           2.2.0       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.1.0           2.1.0       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.0.1           2.0.1       Mayastor Helm chart for Kubernetes
mayastor/mayastor   2.0.0           2.0.0       Mayastor Helm chart for Kubernetes

These are the helm values I used (the basePaths are specific for Talos):

# values.yaml
etcd:
  localpvScConfig:
    basePath: /var/openebs/local/{{ .Release.Name }}/etcd

loki-stack:
  localpvScConfig:
    basePath: /var/openebs/local/{{ .Release.Name }}/loki

io_engine:
  nodeSelector:
    openebs.io/engine: mayastor

nodeSelector: {}

 helm install mayastor mayastor/mayastor -n mayastor --create-namespace --version 2.5.0 --values values.yaml

Expected behavior

Mayastor works on an arm64 server.

OS info:

Additional context

Client:
    Tag:         v1.6.0
    SHA:         eddd188c
    Built:
    Go version:  go1.21.5 X:loopvar
    OS/Arch:     darwin/arm64
Server:
    NODE:        94.XX.XX.XX (IPv4)
    Tag:         v1.6.1
    SHA:         0af17af3
    Built:
    Go version:  go1.21.5 X:loopvar
    OS/Arch:     linux/arm64
    Enabled:     RBAC

tiagolobocastro commented 11 months ago

We don't currently provide arm64 installs for mayastor, only for the kubectl plugin. This is not for a technical reason AFAIK but rather because we have no hardware available to test on arm64 :( Though IIRC there's a couple of users that build their own arm64 images from our code, maybe they can help out, I think they might have either raised issues here or on slack.

tiagolobocastro commented 10 months ago

Leaving this open as the open issue for arm64 support. This is not on the roadmap, but if some external contributor has some arm servers for us to test on, we'd be happy to consider arm64 support.

felipesere commented 9 months ago

What does testing look like here? I am about to set up a TuringPi 2 homelab with Talos and I wanted to use Mayastor for storage.

Given that I use a MacBook Air M1 I can probably also test on that in the interim?

jphastings commented 9 months ago

I have a TuringPi 2 set up (2x RK1s + 2x CM4s) with Talos (great minds @felipesere!) and I'm having this same problem.

I'm inexperienced with kubernetes, but I can learn fast and am very happy to test anything other community members are able to create. (I'll be keeping an eye on this issue for others).

@tiagolobocastro, what hardware would you need to be able to support this officially? (I'm asking to see if I, or others in the TuringPi 2 + Talos community, could find a way to provide it for you).

tiagolobocastro commented 9 months ago

> What does testing look like here? I am about to set up a TuringPi 2 homelab with Talos and I wanted to use Mayastor for storage.

We have a bunch of repos for the different components of mayastor, though tbh the only place where the arch would affect things would be the dataplane (this repo). Besides per-repo CI, we do lots of system testing on Hetzner VMs (these are Kubernetes clusters on VMs).

> Given that I use a MacBook Air M1 I can probably also test on that in the interim?

M1 is a different kettle of fish as it's not Linux, so things like udev won't work, there's no nvme-tcp initiator, etc. Still, you might be able to build this repo on it, not sure. Maybe @hrudaya21 and @dsharma-dc have tried it?

tiagolobocastro commented 9 months ago

> I have a TuringPi 2 set up (2x RK1s + 2x CM4s) with Talos (great minds @felipesere!) and I'm having this same problem.
>
> I'm inexperienced with kubernetes, but I can learn fast and am very happy to test anything other community members are able to create. (I'll be keeping an eye on this issue for others).
>
> @tiagolobocastro, what hardware would you need to be able to support this officially? (I'm asking to see if I, or others in the TuringPi 2 + Talos community, could find a way to provide it for you).

We would need at least some VMs for the per-repo CI. The tricky one would be system test, as that creates a swarm of VMs and runs for many hours. I guess we could perhaps run only the tests which we think would be affected by the CPU architecture. CC @avishnu

jphastings commented 8 months ago

> We would need at least some VM's for the per-repo CI.

If providing this could be as simple as reaching an initial and monthly funding goal on something like open collective then please let me know, and I can see if there's enough interest in the TuringPi forums!

(Naturally any target funding amounts would need to be large enough to cover not just the first month of the VMs, but enough months to make long term maintenance viable, even as monthly donation commitments fluctuate.)

sebadob commented 8 months ago

I can confirm that it's not working with current versions.

I just tried it on a Raspberry Pi + K3s.

A few tiny things that I already fixed on the fly:

Get rid of docker.io/bitnami/bitnami-shell in the volume-permissions containers for loki and etcd, since it has been deprecated officially in favor of bitnami/os-shell which provides an arm64 version again.
I also had to modify the etcd image tag, because not all of them exist for arm64. I just picked the current latest just for testing, since it exists for arm64.

Sadly, etcd is the last thing that does not start up. The pods crash loop, exiting with

ERROR ==> Headless service domain does not have an IP per initial member in the cluster

after a while. This sounds like a config/setup issue rather than an architecture-related one, though.

Apart from this, just a question on the side:
Why use NATS + etcd? If NATS is already being used with Mayastor, why not just use the JetStream KV store from NATS and drop etcd entirely? This would simplify the deployment and save resources.

edit:

Btw I just used the helm command from the docs without any further configuration:

helm install openebs openebs/openebs --namespace openebs --create-namespace --set mayastor.enabled=true

Regarding the one remaining etcd error, I found #1421, which is exactly about that. So I guess arm support would be fixed (so far) by just making the above changes to the helm charts.

FriedCircuits commented 7 months ago

Any update on this? I have a mixed cluster with both x86 and TuringPi2 RK1 nodes. Suggestions by @sebadob worked great. Thanks.

Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

sebadob commented 7 months ago

> Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

I don't have it running anymore, but they simply started. I did not do anything special there. This was an arm-only cluster.

But I moved away from it again because I did not like that the Mayastor pod busy-waited and consumed the full CPU all the time, even when the cluster was idle. I get that doing this synchronously instead of asynchronously reduces latency, but my clusters are usually rather small, and constant 100% CPU usage is not something I wanted.
So I cannot provide more information, sorry.

tiagolobocastro commented 7 months ago

> Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

Try using --set nodeSelector={}
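
This suggestion implies the chart applies a default node selector that keeps pods off some nodes. That is an assumption here rather than something confirmed in this thread. The sketch below compares the architecture labels on the nodes with the selectors the DaemonSets actually carry. The function names are hypothetical, and the `mayastor` namespace is taken from the install command earlier in this issue.

```shell
#!/bin/sh
# Hypothetical helpers; both only wrap standard kubectl queries.

# List nodes together with their CPU architecture label.
show_node_arch() {
  kubectl get nodes -L kubernetes.io/arch
}

# Print each DaemonSet's name and pod nodeSelector in a namespace (default: mayastor).
show_daemonset_selectors() {
  kubectl -n "${1:-mayastor}" get daemonsets \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.nodeSelector}{"\n"}{end}'
}

# Usage (needs cluster access):
#   show_node_arch
#   show_daemonset_selectors mayastor
#   helm show values mayastor/mayastor --version 2.5.0 | grep -n -A2 'nodeSelector'
```

If a selector pins the pods to one architecture, overriding it only affects scheduling. The images themselves still need an arm64 build to start.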

FriedCircuits commented 7 months ago

> > Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.
>
> Try using --set nodeSelector={}

Thanks, but it looks like the mayastor-csi-node image doesn't have an arm64 build. https://hub.docker.com/r/openebs/mayastor-csi-node/tags

Darn, I bought some drives and was trying this out since Jiva was unreliable and has a memory leak in the NDM and it really doesn't like node restarts.

tiagolobocastro commented 7 months ago

I'm afraid we don't currently build arm64 images. If you have an arm64 system you could try building the csi-node yourself. I think there's a couple of users who build their own, maybe you can ask on slack.

tidux commented 5 months ago

Can you guys just get a Mac Mini or something if you need an ARM build host?

ThorbenJ commented 4 months ago

> Leaving this open as the open issue for arm64 support. This is not on the roadmap, but if some external contributor has some arm servers for us to test on, we'd be happy to consider arm64 support.

Hi, an OrangePi 5 Plus http://www.orangepi.org/html/hardWare/computerAndMicrocontrollers/details/Orange-Pi-5-plus-32GB.html has:

It's not a big grunt server, but it only cost about USD 180, and so spoke to me as a great home cluster option.

This is not an advert; it's what I got six of (plus another three of the 16GB RAM version for the k8s control plane), and I would love to run OpenEBS on them. I wanted to use Local PV LVM on the M.2 SSDs for fast local storage, and Mayastor on large USB-attached SSDs for long-term redundant/resilient storage (think Nextcloud with TBs of photos and videos; I care more about recovery than speed).

I would be happy to donate some money so that someone could buy one or two of these, if it meant official arm64 support soon (since it appears most of the work has already been done, but one has to build their own images). Please let me know.

tiagolobocastro commented 4 months ago

That's very kind of you, thank you. I'll talk to the team about the possibility, though I don't think we're geared up in any way to accept donations; sending us the hardware might be easier perhaps.