migtools / mig-controller

OpenShift Migration Controller
Apache License 2.0
22 stars 42 forks source link

[MIG-740] Not setting a storage class left me with a broken migration #1132

Open jmontleon opened 3 years ago

jmontleon commented 3 years ago

Describe the bug I didn't set a storage class when creating a migration using DVM. I did get a warning, but in this case I would have expected to see the default provisioner used. Instead the storageclass was set to "" on the PVC rather than left off. This does not have the effect of using the Default Provisioner.

Additional context https://kubernetes.io/docs/concepts/storage/persistent-volumes/#class-1

PVCs don't necessarily have to request a class. A PVC with its storageClassName set equal to "" is always 
interpreted to be requesting a PV with no class, so it can only be bound to PVs with no class (no annotation
or one set equal to ""). A PVC with no storageClassName is not quite the same and is treated differently by the
cluster, depending on whether the DefaultStorageClass admission plugin is turned on.

We need to have a "no class" "" option in the drop down for cases like NFS. But we should probably have a (default) option in the drop down that omits setting the storageClassName so the default provisioner is used. Alternatively the user can pick any one of the classes available as is the case now.

jmontleon commented 3 years ago

@pranavgaikwad @alaypatel07 @eriknelson @shawn-hurley don't know if you have opinions counter to mine.

eriknelson commented 3 years ago

Thanks for pinging me; I don't see a lot of these issues when they're filed against GH. I created a jira so we can get this planned onto a sprint because there's some decision to be made here exactly how we expect this to work.

sseago commented 3 years ago

If we decide to offer both "no storageclass" and "ignore what's here, use the default", we'll need to change the MigPlan model, as well as DVM and indirect migration behavior to distinguish between empty and nil here (along with updating structs to use pointers). The intent with the current UI is to have the controller choose a default rather than rely on the target cluster to do so, since it takes into account things like access mode, source storageclass selection, and certain special cases around NFS and Gluster source storage. At the same time, we also need to support the "no storageclass" choice, since a user may want to explicitly leave it empty (because they have NFS volumes preprovisioned). IF we want to also add a new choice for "ignore mig-controller default and choose target cluster default instead", that might make things much more confusing, since we then have two different defaulting mechanisms in play which behave differently from each other.

jmontleon commented 3 years ago

Debugging with @sseago @JaydipGabani it looks like this is a ROKS issue. The default storageclass annotation in ROKS is storageclass.beta.kubernetes.io/is-default-class: "true" instead of storageclass.kubernetes.io/is-default-class: "true"

It seems this is known as some documentation points to it: https://www.ibm.com/support/pages/ibm-cloud-pak-integration-known-limitations

APIC fails to install on a cluster due to invalid memory address

APIC fails to deploy on a cluster and throws the following exception due to a missing storage class annotation:

invalid memory address or nil pointer dereference

the default storage class annotation of ROKS is: 
storageclass.beta.kubernetes.io/is-default-class: 'true'

This problem is solved by adding the annotation of the default storageclass using the below command:

oc patch storageclass rook-ceph-block -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

We could work around it if there is a simple way to look for either annotation, or probably just document it similarly if not.

eriknelson commented 3 years ago

CC: @shawn-hurley another one to track