geerlingguy / deskpi-super6c-cluster

DEPRECATED - DeskPi Super6c 6-node Raspberry Pi CM4 Cluster
MIT License

Can't get NFS service working on Ceph cluster #1

Open geerlingguy opened 2 years ago

geerlingguy commented 2 years ago

When I try adding an NFS service to the cluster using the web dashboard, this message pops up in an overlay:

Failed to apply: [Errno 2] No such file or directory: 'ganesha-rados-grace': 'ganesha-rados-grace'

And in the logs:

Failed to apply nfs.nfs spec NFSServiceSpec.from_json(yaml.safe_load('''service_type: nfs
service_id: nfs
service_name: nfs.nfs
placement:
  count: 1
  hosts:
  - deskpi1
''')): [Errno 2] No such file or directory: 'ganesha-rados-grace': 'ganesha-rados-grace'
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 507, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 760, in _apply_service
    daemon_spec = svc.prepare_create(daemon_spec)
  File "/usr/share/ceph/mgr/cephadm/services/nfs.py", line 66, in prepare_create
    daemon_spec.final_config, daemon_spec.deps = self.generate_config(daemon_spec)
  File "/usr/share/ceph/mgr/cephadm/services/nfs.py", line 87, in generate_config
    self.run_grace_tool(spec, 'add', nodeid)
  File "/usr/share/ceph/mgr/cephadm/services/nfs.py", line 225, in run_grace_tool
    timeout=10)
  File "/lib64/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ganesha-rados-grace': 'ganesha-rados-grace'

I even tried installing the dependencies listed on Ceph's NFS documentation page, but that didn't help:

https://github.com/geerlingguy/deskpi-super6c-cluster/blob/dce4ee74d916e260a949c2507934233c719961d0/main.yml#L43-L49
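For readers following along: the traceback ends inside Python's subprocess module, which raises FileNotFoundError with errno 2 when the named executable cannot be found on the PATH of the process making the call. The sketch below reproduces that behaviour and shows one way to check for a binary first. The helper names (missing_binary_error, run_tool) are hypothetical and are not taken from the cephadm source; only the binary name and the 10-second timeout come from the log above.

```python
import shutil
import subprocess


def missing_binary_error(name):
    """Return (errno, filename) if running `name` fails because it is not found."""
    try:
        subprocess.run([name], timeout=10)
    except FileNotFoundError as exc:
        # Same exception class and errno as in the traceback above.
        return exc.errno, exc.filename
    return None


def run_tool(argv, timeout=10):
    """Hypothetical wrapper: report a missing binary instead of raising."""
    if shutil.which(argv[0]) is None:
        return False, f"{argv[0]!r} not found on PATH"
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode == 0, proc.stdout or proc.stderr


if __name__ == "__main__":
    # On a machine without the tool installed this reports it as missing.
    print(run_tool(["ganesha-rados-grace", "--help"]))
```

Note that the lookup uses the PATH of whichever environment runs the calling process, so a package installed elsewhere would not necessarily satisfy it.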

harish-kp commented 2 years ago

I believe NFS in Ceph is tied to either CephFS or the object gateway (RGW).

The initial step to set up NFS would be:

  1. An active cephFS service running.

In your setup, I believe you went straight to NFS, which might be why you were facing issues. From what I could infer from the dashboard in the video, neither the metadata service for CephFS nor the RGW service for the object gateway was active. Also, systemctl status ceph-@nfs.service might give you some insights.

At the end of the video you had CephFS running, so the logical next step to set up NFS would be to remove any dead or stale NFS services and recreate them associated with CephFS.

geerlingguy commented 2 years ago

Thanks! I'll have to take a look at this again when I have the server up and running again—right now I have it pulled apart for some other testing :(

hsalazr commented 1 year ago

I'm having the exact same issue with NFS.

harish-kp commented 1 year ago

Are the CephFS or RGW services running, and have the respective pools been created? You might want to try ceph fs volume create <fs name> (root privileges preferred). This will create the data and metadata pools, after which you can associate NFS with CephFS from the dashboard.
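As a rough illustration of the suggestion above, here is a minimal sketch that builds the ceph fs volume create command and runs it only if the ceph CLI is present. The function names and the file system name pattern are assumptions made for this example; only the ceph fs volume create <fs name> command itself comes from the comment.

```python
import re
import shutil
import subprocess


def fs_volume_create_cmd(fs_name):
    """Build the argv for `ceph fs volume create <fs name>`."""
    # Assumption for this sketch: restrict names to a conservative character set.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", fs_name):
        raise ValueError(f"unexpected file system name: {fs_name!r}")
    return ["ceph", "fs", "volume", "create", fs_name]


def create_volume(fs_name):
    """Run the command if the ceph CLI is installed; return None otherwise."""
    cmd = fs_volume_create_cmd(fs_name)
    if shutil.which("ceph") is None:
        return None
    # Root privileges may be needed, as noted in the comment above.
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # "cephfs" is a placeholder name chosen for this example.
    print(" ".join(fs_volume_create_cmd("cephfs")))
```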

hsalazr commented 1 year ago

I do have a pool and the volume... I tried adding NFS and got the same error reported by @geerlingguy.

I ended up just doing a kernel mount on all 4 Pis, because that is enough for my use case: "having a shared/replicated/highly-available storage across all swarm nodes for docker usage".

harish-kp commented 1 year ago

Could you please get the output of systemctl status ceph-<fsid>-nfs.<nfs-id>.service from the node where the NFS service is placed? If the service has been created, you can view it in the Dashboard at Cluster -> Services in the left panel.

I'm trying to understand whether any packages specific to Raspbian are missing, or whether it is something else entirely.

hsalazr commented 1 year ago


Failed to apply: [Errno 2] No such file or directory: 'ganesha-rados-grace': 'ganesha-rados-grace'

I do have these packages installed on the servers:

harish-kp commented 1 year ago

Could you please try installing nfs-ganesha-rados-grace from the Debian repo?

hsalazr commented 1 year ago

Same result when trying to launch the NFS service.

sudo apt list | grep ganesha

nfs-ganesha-ceph/stable,now 3.4-1 arm64 [installed]
nfs-ganesha-ceph/stable 3.4-1 armhf
nfs-ganesha-doc/stable,stable 3.4-1 all
nfs-ganesha-gluster/stable 3.4-1 arm64
nfs-ganesha-gluster/stable 3.4-1 armhf
nfs-ganesha-gpfs/stable 3.4-1 arm64
nfs-ganesha-gpfs/stable 3.4-1 armhf
nfs-ganesha-mem/stable 3.4-1 arm64
nfs-ganesha-mem/stable 3.4-1 armhf
nfs-ganesha-mount-9p/stable,stable 3.4-1 all
nfs-ganesha-nullfs/stable 3.4-1 arm64
nfs-ganesha-nullfs/stable 3.4-1 armhf
nfs-ganesha-proxy/stable 3.4-1 arm64
nfs-ganesha-proxy/stable 3.4-1 armhf
nfs-ganesha-rados-grace/stable,now 3.4-1 arm64 [installed]
nfs-ganesha-rados-grace/stable 3.4-1 armhf
nfs-ganesha-rgw/stable 3.4-1 arm64
nfs-ganesha-rgw/stable 3.4-1 armhf
nfs-ganesha-vfs/stable 3.4-1 arm64
nfs-ganesha-vfs/stable 3.4-1 armhf
nfs-ganesha/stable,now 3.4-1 arm64 [installed]
nfs-ganesha/stable 3.4-1 armhf
python3-nfs-ganesha/stable,stable 3.4-1 all

Linux pi001 5.15.61-v8+ #1579 SMP PREEMPT Fri Aug 26 11:16:44 BST 2022 aarch64 GNU/Linux
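To make the listing above easier to read, here is a small sketch that filters apt list output down to the packages actually marked as installed. The sample text is a subset of the lines pasted above; the function name installed_packages is hypothetical, and the parsing assumes the name/suite version arch [status] layout seen in that output.

```python
APT_LIST_SAMPLE = """\
nfs-ganesha-ceph/stable,now 3.4-1 arm64 [installed]
nfs-ganesha-ceph/stable 3.4-1 armhf
nfs-ganesha-rados-grace/stable,now 3.4-1 arm64 [installed]
nfs-ganesha-rados-grace/stable 3.4-1 armhf
nfs-ganesha/stable,now 3.4-1 arm64 [installed]
nfs-ganesha/stable 3.4-1 armhf
"""


def installed_packages(apt_list_output):
    """Return (name, version, arch) for each line flagged as installed."""
    result = []
    for line in apt_list_output.splitlines():
        parts = line.split()
        # Expect: "name/suite version arch [status]"; skip anything else.
        if len(parts) < 4 or "/" not in parts[0]:
            continue
        if parts[3].startswith("[installed"):
            name = parts[0].split("/", 1)[0]
            result.append((name, parts[1], parts[2]))
    return result


if __name__ == "__main__":
    for name, version, arch in installed_packages(APT_LIST_SAMPLE):
        print(name, version, arch)
```

For the listing above this leaves three arm64 packages, including nfs-ganesha-rados-grace, which matches the report that installing it on the host did not make the error go away.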

ustaerk commented 1 year ago

This is due to arm builds being disabled for nfs-ganesha-stable: https://github.com/ceph/ceph-build/issues/1979

marin246 commented 1 year ago

Is there a workaround, or is Ceph + NFS on ARM not possible at the moment?