ocp-power-automation / ocp4-upi-powervm

OpenShift on IBM PowerVM servers managed using PowerVC
Apache License 2.0

OCP 4.8 nfs-provisioner permissions applied to wrong namespace #247

Open dannert opened 2 years ago

dannert commented 2 years ago

Deployed OCP 4.8 on POWER.

Before deploying, I changed the Cinder API version from v2 to v3 so that it works with PowerVC 2.0.2 (see the other open issue).

After deployment, NFS provisioning does not work correctly: no PV is created automatically when a PVC is created, so the image-registry-xxx pod fails because its PVC exists but the PV does not.

The issue is that the permissions for nfs-provisioner are assigned in the "default" namespace instead of the nfs-provisioner namespace.

To fix this manually after deployment, I ran: oc adm policy add-scc-to-user hostmount-anyuid system:serviceaccount:nfs-provisioner:nfs-client-provisioner

After that change, PVs are created and bound correctly, the image-registry-xxx pod runs, and the PV shows up in the NFS export directory on the bastion.
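The manual fix above could be wrapped in a small shell helper, sketched here under stated assumptions. The function names `sa_user` and `grant_hostmount_anyuid` are hypothetical and not part of the automation; the `oc` command itself is the one quoted in this issue, and it is only printed when `oc` is not installed.

```shell
# Build the fully qualified user name of a service account:
#   system:serviceaccount:<namespace>:<name>
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: grant the hostmount-anyuid SCC to a service account.
# $1 = namespace, $2 = service account name.
# If oc is not on PATH, the command is printed instead of executed.
grant_hostmount_anyuid() {
  if command -v oc >/dev/null 2>&1; then
    oc adm policy add-scc-to-user hostmount-anyuid "$(sa_user "$1" "$2")"
  else
    echo "oc adm policy add-scc-to-user hostmount-anyuid $(sa_user "$1" "$2")"
  fi
}
```

Calling `grant_hostmount_anyuid nfs-provisioner nfs-client-provisioner` would target the nfs-provisioner namespace rather than "default", which is the difference this issue is about.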

aishwaryabk commented 2 years ago

I have deployed a cluster using the existing automation, which uses the "openstack_blockstorage_volume_v2" resource, and it seems to work well. The nfs-storage-volume is also created. Could you please provide some more details, such as any specific features that are enabled or disabled?

dannert commented 2 years ago

@aishwaryabk I'm using a clean install of PowerVC 2.0.2 without any "add-ons", and using the Cinder v2 interface fails in my environment. Based on the documentation, I believe the official Cinder API version in PowerVC 2.x is v3? I'm running Terraform from a WSL environment under Windows.
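To see where the automation still refers to the v2 volume resource named in this thread, a search along these lines may help. This is a minimal sketch: the function name `find_cinder_v2_refs` is hypothetical, and it only lists files. Whether switching to the provider's `openstack_blockstorage_volume_v3` resource is the right change for this repository is an assumption that would need to be confirmed.

```shell
# List Terraform files under a directory ($1) that reference the
# Cinder v2 volume resource. Prints one file path per line.
find_cinder_v2_refs() {
  grep -rl --include='*.tf' 'openstack_blockstorage_volume_v2' "$1"
}
```

For example, `find_cinder_v2_refs .` run from a checkout of the automation would print the `.tf` files to review.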

I observed one other issue that prevents a successful OCP deploy: the image registry pod fails because the NFS provisioner does not create the PV. I believe one reason is the namespace mismatch described above, but another is incorrect permissions on the created "/export" directory, which is owned by root after deploy and NOT set to 777. As soon as I run the above "oc" command AND do a chmod 777 on /export, the PV is created and bound, and the image registry pod comes up. The NFS server in my case is the bastion deployed by the Terraform scripts (the latest version I could find).
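The permission check described above can be sketched as a couple of shell helpers. This assumes GNU coreutils `stat` (as on a RHEL-based bastion); the function names are hypothetical, and `/export` is the path reported in this thread.

```shell
# Print "<octal mode> <owner>" for a directory, e.g. to inspect /export.
dir_mode_owner() {
  stat -c '%a %U' "$1"
}

# Succeed only if the directory mode is exactly 777.
is_mode_777() {
  [ "$(stat -c '%a' "$1")" = "777" ]
}

# Possible use on the bastion (not executed here):
#   dir_mode_owner /export
#   is_mode_777 /export || sudo chmod 777 /export
```

Checking the mode before and after the playbooks run could also help confirm or rule out a timing problem in when the directory permissions get set.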

aishwaryabk commented 2 years ago

I tried to run my deployment on Windows as well. However, I am not facing any such issues. @dannert Could you please share your var file without the sensitive credentials if possible?

dannert commented 2 years ago

Here is the var file I'm using: var.tfvars.txt

yussufsh commented 2 years ago

@dannert Our PowerVC 2.0 supports both the v2 and v3 Cinder endpoints, so we are not seeing the error. Going forward, we should extend the volume resource to support v3. Regarding the permission issue on the export directory, we should go ahead and make that change as we do in the PowerVS automation, but the helpernode should set it to 777 anyway before we run ocp4-playbooks. Maybe there is a race condition here.

However, I am not sure why the default namespace is being used in your case; that is something we need to debug. I doubt it is because of the Cinder API version. Again, we are using helpernode to configure the nfs-provisioner.

dannert commented 2 years ago

@yussufsh Regarding the namespace: that was the first thing I changed, based on feedback from a colleague. As it still did not work after that, I changed the permissions, and then it worked. So at this time I cannot state with confidence whether the wrong namespace was in place after deploy.