nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Issue with users' jobs within a pod isolation #734

Closed Milstein closed 2 months ago

Milstein commented 2 months ago

Issue with users' jobs within a pod isolation:

We are close to getting our app running on NERC. However, we still have one issue. For isolation purposes, we use two users inside our pod. The positron user (UID 100197001) starts a job process inside the pod (using subprocess) and runs it as the job_user (UID 100197002). Our Dockerfile sets up both users appropriately, and this approach works on K8s clusters outside of OpenShift. It appears that something in the SCC (Security Context Constraints) is blocking the user switch.
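For readers following along, here is a minimal Python sketch of the kind of launch described above. It is not the application's actual code: the helper names `build_su_command` and `run_job` are hypothetical, and it assumes the job is started through `su` (as the error message in this report suggests) and that a `job_user` account exists in the image.

```python
import shlex
import subprocess

# Account name taken from the issue text; assumed to exist in the image.
JOB_USER = "job_user"


def build_su_command(argv, user=JOB_USER):
    """Return the argument list that runs `argv` as `user` through su."""
    # shlex.join quotes each argument so the shell started by su
    # sees the same argv that the caller passed in.
    return ["su", user, "-c", shlex.join(argv)]


def run_job(argv, user=JOB_USER):
    """Run the job as `user` and return the CompletedProcess.

    check=False so that the caller can inspect returncode and stderr
    (for example a "cannot set groups" message) instead of getting
    an exception.
    """
    return subprocess.run(
        build_su_command(argv, user),
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling `run_job(["python", "job.py"])` from the positron process would then attempt the user switch; whether it succeeds depends on the privileges the container has been granted.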

The subprocess call is returning the following error when trying to set the user for the command: su: cannot set groups: Operation not permitted
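As background on this error: `su` calls setgroups(2) before switching users, and on Linux that call needs the CAP_SETGID capability (the later UID change needs CAP_SETUID). A restrictive SCC that drops these capabilities would likely produce exactly this message. The sketch below is one way to check from inside the container; the function names are hypothetical, while the bit positions 6 and 7 come from the Linux capability definitions.

```python
# Capability bit indexes as defined in linux/capability.h.
CAP_SETGID = 6
CAP_SETUID = 7


def parse_effective_caps(status_text):
    """Return the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask without a 0x prefix.
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def can_switch_user(cap_mask):
    """Return True if both CAP_SETUID and CAP_SETGID are set in the mask."""
    needed = (1 << CAP_SETGID) | (1 << CAP_SETUID)
    return (cap_mask & needed) == needed


def current_process_can_switch_user():
    """Check the effective capabilities of the calling process (Linux only)."""
    with open("/proc/self/status") as status_file:
        return can_switch_user(parse_effective_caps(status_file.read()))
```

If `current_process_can_switch_user()` returns False inside the pod, the `su` call cannot succeed regardless of how the users are defined in the image.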

Is it possible to update a config on your side so that we can perform the operation as intended? Please advise.

Thx,

Matt Brewster

larsks commented 2 months ago

@Milstein I've just created https://github.com/OCP-on-NERC/nerc-ocp-config/pull/535, which grants the necessary privileges to the robbie-job-runner ServiceAccount to make this work.

Matt will need to update his Deployment to (a) use the robbie-job-runner ServiceAccount, if it doesn't already, and (b) set an explicit UID on the container. That might look something like this:

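apiVersion: apps/v1
kind: Deployment
metadata:
  name: uidexample
spec:
  # A Deployment needs a selector that matches the pod template labels.
  # The "app: uidexample" label is only an illustrative placeholder.
  selector:
    matchLabels:
      app: uidexample
  template:
    metadata:
      labels:
        app: uidexample
    spec:

      # Run the pod using the robbie-job-runner ServiceAccount.
      serviceAccountName: robbie-job-runner
      containers:
        - name: uidexample
          image: uidexample:latest

          # Run the container with the UID of the "positron" user.
          securityContext:
            runAsUser: 1001970001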

This should work as soon as the PR is merged.

larsks commented 2 months ago

You can find my test environment for this issue here.