aristanetworks / arista-ceoslab-operator

K8s operator for managing meshnet-networked cEOS-lab instances

Pods require extended privileges but still use the default service account #5

Open raballew opened 1 year ago

raballew commented 1 year ago

This issue is closely related to https://github.com/open-traffic-generator/ixia-c-operator/issues/18. I assume that a similar issue occurs for other vendors as well, so it might be worth discussing a more generic solution that allows users to run KNE with different flavors of Kubernetes.

When deploying a topology with Arista cEOS nodes to a cluster, a pod is created for each virtual instance. Since these pods do not use a dedicated service account, OpenShift runs their containers with only minimal privileges. This causes errors in the controller manager such as this entry:

1.667914055966466e+09   ERROR   Failed to create &Pod{ObjectMeta:{r3  3-node-ceos    0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[app:r3 model:ceos os:eos topo:3-node-ceos vendor:ARISTA version:] map[] [{ceoslab.arista.com/v1alpha1 CEosLabDevice r3 9a8a226b-6e62-409b-bd7b-1a955cd2227e 0xc002f1feb1 0xc002f1feb0}] []  []},Spec:PodSpec{Volumes:[]Volume{Volume{Name:volume-configmap-intfmapping-r3,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:nil,NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:nil,RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:&ConfigMapVolumeSource{LocalObjectReference:LocalObjectReference{Name:configmap-intfmapping-r3,},Items:[]KeyToPath{},DefaultMode:nil,Optional:nil,},VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:nil,StorageOS:nil,CSI:nil,Ephemeral:nil,},},Volume{Name:volume-configmap-rceos-r3,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:nil,NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:nil,RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:&ConfigMapVolumeSource{LocalObjectReference:LocalObjectReference{Name:configmap-rceos-r3,},Items:[]KeyToPath{},DefaultMode:*509,Optional:nil,},VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:nil,StorageOS:nil,CSI:nil,Ephemeral:nil,},},Volume{Name:volume-r3-config,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:nil,NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:nil,RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:&ConfigMapVolumeSource{LocalObjectReference:LocalObjectReference{Name:r3-config,},Items:[]KeyToPath{},DefaultMode:nil,Optional:nil,},VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:nil,StorageOS:nil,CSI:nil,Ephemeral:nil,},},Volume{Name:volume-secret-selfsigned-r3-0,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:&SecretVolumeSource{SecretName:secret-selfsigned-r3-0,Items:[]KeyToPath{},DefaultMode:nil,Optional:nil,},NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:nil,RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:nil,StorageOS:nil,CSI:nil,Ephemeral:nil,},},},Containers:[]Container{Container{Name:ceos,Image:image-registry.openshift-image-registry.svc:5000/openshift/ceos64:4.28.3M,Command:[/sbin/init],Args:[systemd.setenv=CEOS=1 systemd.setenv=EOS_PLATFORM=ceoslab systemd.setenv=ETBA=1 systemd.setenv=INTFTYPE=eth systemd.setenv=SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT=1 
systemd.setenv=container=docker],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{EnvVar{Name:CEOS,Value:1,ValueFrom:nil,},EnvVar{Name:EOS_PLATFORM,Value:ceoslab,ValueFrom:nil,},EnvVar{Name:ETBA,Value:1,ValueFrom:nil,},EnvVar{Name:INTFTYPE,Value:eth,ValueFrom:nil,},EnvVar{Name:SKIP_ZEROTOUCH_BARRIER_IN_SYSDBINIT,Value:1,ValueFrom:nil,},EnvVar{Name:container,Value:docker,ValueFrom:nil,},},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{cpu: {{500 -3} {<nil>} 500m DecimalSI},memory: {{1073741824 0} {<nil>} 1Gi BinarySI},},},VolumeMounts:[]VolumeMount{VolumeMount{Name:volume-configmap-intfmapping-r3,ReadOnly:false,MountPath:/mnt/flash/EosIntfMapping.json,SubPath:EosIntfMapping.json,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:volume-configmap-rceos-r3,ReadOnly:false,MountPath:/mnt/flash/rc.eos,SubPath:rc.eos,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:volume-r3-config,ReadOnly:false,MountPath:/mnt/flash/startup-config,SubPath:startup-config,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:volume-secret-selfsigned-r3-0,ReadOnly:false,MountPath:/mnt/flash/gnmiCert.pem,SubPath:gnmiCert.pem,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:volume-secret-selfsigned-r3-0,ReadOnly:false,MountPath:/mnt/flash/gnmiCertKey.pem,SubPath:gnmiCertKey.pem,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:,ImagePullPolicy:IfNotPresent,SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,ProcMount:nil,WindowsOptions:nil,SeccompProfile:nil,},Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:,VolumeDevices:[]VolumeDevice{},StartupProbe:&Probe{ProbeHandler:ProbeHandler{Exec:&ExecAction{Command:[wfw -t 5],},HTTPGet:nil,TCPSocket:nil,GRPC:nil,},InitialDelaySeconds:0,TimeoutSeconds:5,PeriodSeconds:5,SuccessThreshold:0,FailureThreshold:24,TerminationGracePeriodSeconds:nil,},},},RestartPolicy:,TerminationGracePeriodSeconds:*0,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{Container{Name:init-r3,Image:networkop/init-wait:latest,Command:[],Args:[7 
0],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,},},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,},Status:PodStatus{Phase:,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},} for CEosLabDevice r3    {"controller": "ceoslabdevice", "controllerGroup": "ceoslab.arista.com", "controllerKind": "CEosLabDevice", "cEosLabDevice": {"name":"r3","namespace":"3-node-ceos"}, "namespace": "3-node-ceos", "name": "r3", "reconcileID": "8f724d2a-5d85-4af0-b2ac-9ceb2c9c0e8a", "error": "pods \"r3\" is forbidden: unable to validate against any security context constraint: [provider \"anyuid\": Forbidden: not usable by user or serviceaccount, spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed, provider \"nonroot-v2\": Forbidden: not usable by user or serviceaccount, provider \"nonroot\": Forbidden: not usable by user or serviceaccount, provider \"hostmount-anyuid\": Forbidden: not usable by user or serviceaccount, provider \"machine-api-termination-handler\": Forbidden: not usable by user or serviceaccount, provider \"hostnetwork-v2\": Forbidden: not usable by user or serviceaccount, provider \"hostnetwork\": Forbidden: not usable by user or serviceaccount, provider \"hostaccess\": Forbidden: not usable by user or serviceaccount, provider \"node-exporter\": Forbidden: not usable by user or serviceaccount, provider \"privileged\": Forbidden: not usable by user or serviceaccount]"}

For more details please check the attached log file.

arista-ceoslab-operator-controller-manager-5ff748b8db-x6wmn-manager.log

This seems to be fixable by extending the privileges of the default service account as shown below, but in general this practice is not recommended anywhere, since other pods that do not specify a different service account will also inherit these privileges.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: privileged-role
rules:
  - apiGroups:
      - security.openshift.io
    resourceNames:
      - privileged
    resources:
      - securitycontextconstraints
    verbs:
      - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: privileged-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: privileged-role
subjects:
  - kind: ServiceAccount
    name: default
    # RBAC requires a namespace for ServiceAccount subjects;
    # this is the topology namespace from the log above
    namespace: 3-node-ceos
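
Applying this in the topology namespace makes the pods schedulable again (the file name here is just illustrative):

kubectl apply -f privileged-rbac.yaml -n 3-node-ceos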

A better solution would be to use a dedicated service account for pods created by the controller, so that the extended privileges are limited to a specific set of applications running in this namespace.
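
As a minimal sketch of what that could look like, reusing the privileged-role from above (the ceos-lab name is only a placeholder here):

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ceos-lab
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ceos-lab-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: privileged-role
subjects:
  - kind: ServiceAccount
    name: ceos-lab # placeholder name for the dedicated service account
    namespace: 3-node-ceos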

frasieroh commented 1 year ago

Thanks for bringing this to my attention.

To my knowledge this is the first time anyone has used anything other than a local "kind" cluster for KNE (we certainly haven't been doing this). The intended model for KNE is that the cluster itself is disposable and that it's only running other network OS and traffic generator containers.

From what I can gather, "security context constraints" are an OpenShift-specific construct and other cloud providers have approximately similar offerings for controlling privileges on a per-pod or per-service-account basis. We aren't interested in integrating any technology specific to a given cloud, so there is no possibility the operator will transparently configure these constraints.

If I'm understanding correctly, what you're asking is:

  1. Create a non-default service account for the controller
  2. Use this service account for pods
  3. Cluster operators can then configure the permissions of this service account appropriately, or ignore this mechanism if their environment does not enforce permissions (for example, kind)

So it would still be your responsibility to make sure the cluster is configured to allow the new service account to create privileged containers, but exactly how that's done wouldn't be tied to the operator.
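
On OpenShift specifically, that grant could then be a single command (assuming the dedicated service account ends up being named ceos-lab, as suggested below):

oc adm policy add-scc-to-user privileged -z ceos-lab -n 3-node-ceos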

raballew commented 1 year ago

@burnyd has tried running KNE on an actual k8s cluster, and this is where I originally got the idea from. While emulating a topology on kind certainly works for most developers, my idea was to set up CI workflows that pre-validate hardware tests on a central instance. Also, in my experience, the system requirements of a larger topology usually exceed the specs of a local machine.

I understand that SCCs are an OpenShift-specific construct and that being vendor agnostic is the right way to go. Anyway, what I would need is the following:

➜  ~ kubectl get pods -n ceosnms dc1-spine1 -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2022-11-01T14:52:57Z"
  labels:
    app: dc1-spine1
    topo: ceosnms
  name: dc1-spine1
  namespace: ceosnms
  ownerReferences:
  - apiVersion: ceoslab.arista.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: CEosLabDevice
    name: dc1-spine1
    uid: e22c2913-7d26-48ca-ad33-da0c071b7a98
  resourceVersion: "2869042"
  uid: 2135a16b-8bcf-4906-919c-996545b9bebb
spec:
  <truncated>
  securityContext: {}
  serviceAccount: default

Change the default service account for this kind of pod to something else that is well known in advance, let's say ceos-lab. This would allow the cluster ops team to patch RBAC for ceos-lab individually without impacting other workloads' privileges in the namespace. Maybe this is even a feature for KNE itself, where the name of the SA that should be used can be configured.
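
In the operator itself this would presumably boil down to setting the ServiceAccountName field when the pod spec is built. A hypothetical sketch (function and account names are illustrative, not the actual operator code):

package controller

import corev1 "k8s.io/api/core/v1"

// podSpecForDevice sketches where the change would live: the controller
// sets a well-known service account on the pods it creates instead of
// falling back to "default".
func podSpecForDevice(containers []corev1.Container) corev1.PodSpec {
	return corev1.PodSpec{
		// Well-known name so cluster operators can grant privileges
		// (e.g. an SCC on OpenShift) to just these pods.
		ServiceAccountName: "ceos-lab",
		Containers:         containers,
	}
}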

alexmasi commented 1 year ago

Just a note: KNE definitely wants to support cluster types other than just kind. We have been working internally on using KNE with a multinode cluster solution (kubeadm).

Currently, in the deployment config you can specify External instead of Kind, which allows the user to bring their own cluster; the KNE deps then get deployed on top of that existing cluster. The user would then be responsible for lifecycle management of the cluster.
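
For reference, such an External deployment config might look roughly like this; this is only a sketch, and the exact spec fields depend on the KNE version:

cluster:
  kind: External
ingress:
  kind: MetalLB
  spec:
    ip_count: 100
cni:
  kind: Meshnet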