eero-t opened this issue 2 years ago (status: Open)
Tuomas fixed the deployment rule inconsistency in https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1169 and Ukri is fixing the overlay object name inconsistency in https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1172.
As to the remaining namespace inconsistency...
According to @ukri, it's better that a manually installed GPU plugin and the GPU plugin components that the operator manages automatically are in separate namespaces. But at least the namespace overlay and the install doc example should use the same namespace.
The namespace can be given as a kubectl/kustomize command line option. Maybe it's best if the namespace overlay is deprecated (and eventually removed), and the install doc uses e.g. "intel-gpu" as the namespace in the kubectl example?
(I suggested the "intel-gpu" namespace to indicate that pods in it are Intel-related components, unlike the generic stuff in "kube-system", which is used by the current namespace overlay.)
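As a rough sketch of what such a doc example could look like, assuming the proposed "intel-gpu" namespace and a placeholder overlay path (neither is settled in this thread). The script only prints the commands so they can be reviewed before running. Note that kubectl refuses `--namespace` when a manifest hard-codes a different namespace, so this presumably only works for deployments that leave the namespace unset.

```shell
#!/bin/sh
# Assumptions: "intel-gpu" is the namespace suggested above, and OVERLAY is
# a placeholder path; adjust both for the actual deployment.
NAMESPACE="${NAMESPACE:-intel-gpu}"
OVERLAY="${OVERLAY:-deployments/gpu_plugin}"

# Print the commands instead of executing them, so they can be checked first.
install_cmds() {
    echo "kubectl create namespace $NAMESPACE"
    echo "kubectl apply --namespace $NAMESPACE -k $OVERLAY"
}

install_cmds
```

With kustomize itself, the equivalent would be something like `kustomize edit set namespace intel-gpu` in the overlay directory before building.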
I grepped the namespaces from deployments again:
$ grep -sr namespace: * | grep "namespace: ." | grep -v -e " system" -e inteldeviceplugins-system
fpga_admissionwebhook/default/kustomization.yaml:namespace: intelfpgawebhook-system
fpga_plugin/overlays/af/kustomization.yaml:namespace: intelfpgaplugin-system
fpga_plugin/overlays/region/kustomization.yaml:namespace: intelfpgaplugin-system
gpu_plugin/overlays/fractional_resources/gpu-manager-rolebinding.yaml: namespace: default
nfd/overlays/node-feature-discovery/node-feature-discovery-openshift.yaml: namespace: openshift-nfd
operator/manifests/bases/intel-device-plugins-operator.clusterserviceversion.yaml: namespace: placeholder
sgx_admissionwebhook/overlays/default-with-certmanager/kustomization.yaml:namespace: intelsgxwebhook-system
sgx_plugin/overlays/epc-register/service-account.yaml: namespace: kube-system
sgx_plugin/overlays/epc-register/service-account.yaml: namespace: kube-system
sgx_plugin/overlays/epc-register/kustomization.yaml:namespace: kube-system
xpumanager_sidecar/kustomization.yaml:namespace: monitoring
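To see at a glance how many distinct namespaces are in use, the grep above could be extended to tally the values. This is only a sketch; `count_namespaces` is a hypothetical helper, not something in the repo, and unlike the command above it does not filter out the "system" namespaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): tally the distinct
# "namespace:" values found under a directory such as deployments/.
count_namespaces() {
    dir="${1:-.}"
    # -r: recurse, -s: suppress file errors, -h: omit file names from output
    grep -rsh 'namespace: .' "$dir" |
        # keep only the value after "namespace:", trimming trailing blanks
        sed -e 's/^.*namespace:[[:space:]]*//' -e 's/[[:space:]]*$//' |
        # count identical values, most frequent first
        sort | uniq -c | sort -rn
}
```

Running e.g. `count_namespaces deployments` from the repo root would then list each namespace with the number of lines that set it.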
Some suggestions:
intelfpgawebhook-system and intelfpgaplugin-system as inteldeviceplugins-system?
intelsgxwebhook-system as inteldeviceplugins-system?
kube-system as inteldeviceplugins-system?
Suggestions look fine to me. @mythi ?
I noticed some confusing inconsistencies when checking things under the deployments folder and the different installation methods...
GPU plugin is installed on different nodes depending on the method used:
IMHO at least the last two first options (relying on NFD), should be the same, not differ:
gpu_plugin/overlays/nfd_labeled_nodes, they go only to nodes with Intel GPUs supporting display output
Whereas inconsistencies in namespace usage between installation methods could mean that the user accidentally ends up with multiple instances of the same thing running in different namespaces:
For example, when using the operator, the GPU plugin goes to the inteldeviceplugins-system ns; when using the GPU plugin README apply method, it goes to the default ns; and when using the gpu_plugin/overlays/namespace_kube-system overlay, it goes to the kube-system ns. Some of the other plugins use different namespaces, but that's not consistent either.
GPU plugin also uses different service accounts & roles depending on the installation method (this is from repo root):