nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Implement solution for RWX storage for Red Hat ET/InstructLab group #737

Closed larsks closed 1 month ago

larsks commented 2 months ago

The Red Hat emerging technology group working with InstructLab requires RWX storage to support their workflows. Absent support for this from NESE, we need to implement a short-term solution, which could be:

The nfs-ganesha solution would be nice because (a) it would require no additional hardware and (b) it could take advantage of the NESE storage allocation associated with that cluster.

There are several parts to this feature:

jtriley commented 2 months ago

I'm looking into getting this running on ocp-test as a possible solution. We might need to adjust the setup there but at least it's a starting point:

https://github.com/kwkoo/openshift-nfs-server

larsks commented 2 months ago

@jtriley it looks like the repository to which you've linked already incorporates the nfs-subdir provisioner, so :+1: for that.

jtriley commented 2 months ago

I'm pulling that container definition out to its own repo so we can just pull that image vs building:

https://github.com/nerc-project/nerc-nfs-server

After that I'll create an nfs-rwx bundle in nerc-ocp-config with those manifests and add it to the nerc-ocp-test overlay.

jtriley commented 2 months ago

Still sorting out a couple of manual steps needed in addition to the manifests in the PR but this is looking promising (on nerc-ocp-test):

$ oc get pvc -n default
NAME   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
nfs    Bound    pvc-2caa0347-9e9c-4269-b779-9bff61fbdd96   1Mi        RWX            managed-nfs-storage   12m
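
For reference, a claim like the one in the output above could be requested roughly as follows. This is only a sketch, not the manifest from the PR: the name, namespace, size, access mode, and storage class are taken from the `oc get pvc` output, while the `render_pvc` helper and the manifest layout are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: renders a PVC manifest matching the `oc get pvc` output above.
# The values (nfs, default, 1Mi, ReadWriteMany, managed-nfs-storage) come from
# that output; the helper name and the manifest itself are assumptions.
render_pvc() {
  cat <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: managed-nfs-storage
  resources:
    requests:
      storage: 1Mi
EOF
}

# Print the manifest so it can be reviewed before it is applied.
render_pvc
```

To create the claim on a cluster where that storage class exists, the output can be piped to the CLI with `render_pvc | oc apply -f -`.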

schwesig commented 2 months ago

FYI, NFS ideas and tutorial from Operate First:

https://github.com/search?q=org%3Aoperate-first%20nfs&type=code

https://github.com/operate-first/apps/blob/34402be8f720c55cc59f59ac1ec5ae92226bfaa7/nfs-server-and-provisioner/README.md?plain=1#L28

/CC @hpdempsey

joachimweyl commented 1 month ago

Short-term solution in place.