Closed Balaji-MP closed 1 year ago
@Balaji-MP It definitely looks like a lack of permissions to load the eBPF probe. Just in case, could you share the DaemonSet definition and the SecurityContext you ended up with?
@erthalion here is the definition and the security context within it

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      annotations:
        deprecated.daemonset.template.generation: "4"
        email: support@stackrox.com
        meta.helm.sh/release-name: stackrox-secured-cluster-services
        meta.helm.sh/release-namespace: rhacs-operator
        owner: stackrox
      creationTimestamp: "2023-02-16T08:25:19Z"
      generation: 4
      labels:
        app: collector
        app.kubernetes.io/component: collector
        app.kubernetes.io/instance: stackrox-secured-cluster-services
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: stackrox
        app.kubernetes.io/part-of: stackrox-secured-cluster-services
        app.kubernetes.io/version: 3.73.2
        auto-upgrade.stackrox.io/component: sensor
        helm.sh/chart: stackrox-secured-cluster-services-73.2.0
        service: collector
      name: collector
      namespace: rhacs-operator
      ownerReferences:
@Balaji-MP any chance you could run `kubectl describe ds collector` to get the events as well?
@erthalion here are the events. The current state of the pod is CrashLoopBackOff.

    Events:
      Type    Reason            Age  From                  Message
      Normal  SuccessfulCreate  10s  daemonset-controller  Created pod: collector-jst65
      Normal  SuccessfulCreate  3s   daemonset-controller  Created pod: collector-x86fj
@erthalion I guess the permission issue is caused by the `eval` in line 94. I might be wrong; any thoughts on this?
`bootstrap.sh` (including the `eval` part) is only responsible for starting Collector. The issue you observe is happening when Collector tries to load eBPF probes.
@erthalion any thoughts on this one ?
What happens if you remove this part from the security context?

    seLinuxOptions:
      type: container_runtime_t
same error and nothing changed.
@Balaji-MP what about the SCC? You haven't posted it yet. Can you show `scc/stackrox-collector`?
@erthalion here is the security context in stackrox-collector

    securityContext:
      runAsUser: 1000
      runAsGroup: 3000
      fsGroup: 2000
    containers:
There is also a SecurityContextConstraints (SCC) object, which should have more information, e.g. whether privileged containers are allowed. Having said that, can you describe your OpenShift setup in more detail? Is there anything special about it?
@erthalion here is the SCC applied for this collector

    runAsUser:
      type: RunAsAny
    seLinuxContext:
      type: RunAsAny
    seccompProfiles:
My cluster is standard and no additional restrictions are in place.
@erthalion can you please share the directory location where the collector will create the map?
It's a BPF map, so it's not located on the filesystem. The problem here is that your OpenShift setup somehow prevents Collector from executing the `bpf` syscall; we need to find out why.
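One way to narrow this down, independently of Collector, would be to attempt a trivial `bpf` call from inside a container running with the same security context and look at the error code. The snippet below is only a minimal sketch under stated assumptions: it is not part of Collector, the helper names `probe_bpf` and `explain_errno` are hypothetical, the syscall numbers are hard-coded for x86_64 and aarch64 only, and the interpretation of each error code is a rough guide rather than a definitive diagnosis.

```python
import ctypes
import errno
import os
import platform

# Syscall numbers for bpf(2); only two common architectures are listed (assumption).
BPF_SYSCALL_NR = {"x86_64": 321, "aarch64": 280}
BPF_MAP_CREATE = 0       # bpf() command: create a map
BPF_MAP_TYPE_ARRAY = 2   # simplest map type to create


class BpfMapCreateAttr(ctypes.Structure):
    """Leading fields of the bpf_attr union used by BPF_MAP_CREATE."""
    _fields_ = [
        ("map_type", ctypes.c_uint32),
        ("key_size", ctypes.c_uint32),
        ("value_size", ctypes.c_uint32),
        ("max_entries", ctypes.c_uint32),
        ("map_flags", ctypes.c_uint32),
    ]


def explain_errno(err):
    """Turn an errno value from the probe into a rough, human-readable hint."""
    if err == 0:
        return "bpf() succeeded: the syscall is permitted"
    name = errno.errorcode.get(err, str(err))
    if err in (errno.EPERM, errno.EACCES):
        return f"{name}: bpf() was denied (capabilities, seccomp or SELinux are possible causes)"
    if err == errno.ENOSYS:
        return f"{name}: bpf() is not available here (old kernel, or filtered by the runtime)"
    return f"{name}: bpf() failed for another reason"


def probe_bpf():
    """Try to create a one-element array map; return 0, an errno, or None if unsupported."""
    nr = BPF_SYSCALL_NR.get(platform.machine())
    if nr is None:
        return None  # architecture not covered by this sketch
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    attr = BpfMapCreateAttr(BPF_MAP_TYPE_ARRAY, 4, 4, 1, 0)
    fd = libc.syscall(
        ctypes.c_long(nr),
        ctypes.c_long(BPF_MAP_CREATE),
        ctypes.byref(attr),
        ctypes.c_long(ctypes.sizeof(attr)),
    )
    if fd < 0:
        return ctypes.get_errno()
    os.close(fd)  # the map is released once its file descriptor is closed
    return 0
```

Running `print(explain_errno(probe_bpf()))` inside the pod (when `probe_bpf()` does not return `None`) would show whether the kernel rejects the call outright or whether the failure is specific to what Collector loads.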
here is the SCC applied for this collector

    runAsUser:
      type: RunAsAny
    seLinuxContext:
      type: RunAsAny
    seccompProfiles: '*'
    supplementalGroups:
      type: RunAsAny
This doesn't look complete, isn't there anything saying something like below?
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
@erthalion no, I don't see anything related to allowPrivilegeEscalation or allowPrivilegedContainer.
no, I don't see anything related to allowPrivilegeEscalation or allowPrivilegedContainer.
That sounds strange to me. So the output of `oc get scc/stackrox-collector -o yaml` doesn't show anything else except what you've posted?
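To make that check less error-prone than reading the output by eye, the same SCC could be dumped as JSON (`oc get scc/stackrox-collector -o json`) and inspected programmatically. This is a minimal sketch, assuming the two fields mentioned above are the ones of interest; the helper name `missing_privilege_flags` is hypothetical and not part of any StackRox or OpenShift tooling.

```python
import json

# The two SCC fields asked about in this thread.
EXPECTED_FLAGS = ("allowPrivilegedContainer", "allowPrivilegeEscalation")


def missing_privilege_flags(scc_json):
    """Return the expected flags that are absent or not set to true in the SCC."""
    scc = json.loads(scc_json)
    return [flag for flag in EXPECTED_FLAGS if scc.get(flag) is not True]


# Example input mirroring the fields posted earlier in this thread.
posted = json.dumps({
    "runAsUser": {"type": "RunAsAny"},
    "seLinuxContext": {"type": "RunAsAny"},
    "seccompProfiles": ["*"],
    "supplementalGroups": {"type": "RunAsAny"},
})
print(missing_privilege_flags(posted))
```

An empty list would mean both flags are present and set to `true`; anything else names the fields that are missing or disabled.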
Yes, that's correct
@stackrox/collector-team any updates on this issue?
Unfortunately no, nobody has had the capacity to look further into it.
@Balaji-MP TBH Openshift 4.9 is quite dated... might even be out of support? Would it be feasible for you to upgrade to a more recent version?
@porridge let me upgrade to the latest version and check. In the meantime, do you have a minimum recommended version?
4.12 would be my first choice
@porridge You are correct: upgrading to version 4.12 fixed the issue.
Awesome! Let us know if you need anything else.
Hello Team, I received the following error while deploying the collector in OpenShift 4.9. Initially I thought this was a permission issue and added the required SCC to the collector's service account, but the issue still persists.