opendatahub-io / opendatahub-operator

Open Data Hub operator to manage ODH component integrations
https://opendatahub.io
Apache License 2.0

fix: change user of operator to 1001 #1155

Open zdtsw opened 2 months ago

zdtsw commented 2 months ago

Description

https://issues.redhat.com/browse/RHOAIENG-10272

How Has This Been Tested?

local build: `quay.io/wenzhou/opendatahub-operator-catalog:v2.13.10272-3`
test:

Screenshot or short clip

Merge criteria

openshift-ci[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from zdtsw. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.

bartoszmajsak commented 2 months ago

I recall facing a similar problem as described in linked JIRA when looking at bringing Workbenches to Service Mesh - #616

It turned out that the runAs settings on the pods were not aligned with what OpenShift was setting:

```yaml
annotations:
  openshift.io/sa.scc.mcs: s0:c26,c25
  openshift.io/sa.scc.supplemental-groups: 1000700000/10000
  openshift.io/sa.scc.uid-range: 1000700000/10000
```

Changing the default SA to `restricted-v2` did the job:

https://github.com/opendatahub-io/opendatahub-operator/blob/286caf966aa7c6046726aac25abed456254baee2/controllers/dscinitialization/utils.go#L164

but I am not sure whether this is relevant in this context. Though it might be a thing to work on as well.

bartoszmajsak commented 2 months ago

If the RunAsUser field is absent from the manifest, vanilla k8s will honor the value set in the Dockerfile, but when running on OpenShift, it will automatically assign a UID that satisfies its requirements. Have you tried that @zdtsw?

We did similar work in the context of Istio and Openshift https://github.com/istio/istio/pull/45394
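
A minimal sketch of the omitted-RunAsUser approach (a hypothetical manifest fragment, not this repo's actual deployment): with `runAsUser` left unset, plain Kubernetes falls back to the image's `USER`, while OpenShift's restricted SCC injects a UID from the namespace's pre-allocated range.

```yaml
# Illustrative deployment fragment only; names are made up.
spec:
  template:
    spec:
      containers:
        - name: manager
          securityContext:
            allowPrivilegeEscalation: false
            # runAsUser deliberately omitted: vanilla k8s uses the image's
            # USER; OpenShift assigns a UID from the namespace's uid-range.
```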

ykaliuta commented 2 months ago

Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)

zdtsw commented 2 months ago

> If the RunAsUser field is absent from the manifest, vanilla k8s will honor the value set in the Dockerfile, but when running on OpenShift, it will automatically assign a UID that satisfies its requirements. Have you tried that @zdtsw?
>
> We did similar work in the context of Istio and Openshift istio/istio#45394

Interesting topic, these permissions :D

I think we have 2 parts here:

  1. Which UID the Operator pod starts with: this is related to the one we used to build the image and to the SA we assigned in the deployment, which is passed to the pod at runtime (the subject of this PR and its JIRA).
  2. Which role (in our case it is set to anyuid) is bound to the "default" SA in the application namespace: this is probably for component controllers to use if they do not specify an SA (e.g. workbench?), overriding "restricted-v2" with "anyuid" to grant more permissions. TBH, I am not really sure why or whether we need this block of code, unless using "restricted-v2" could cause a problem (e.g. a component controller wanting to run as root?).
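
For context, granting an SCC to the "default" SA is typically done by binding the SCC's auto-generated cluster role (OpenShift ships one per SCC, e.g. `system:openshift:scc:anyuid`). A hedged sketch, with the namespace and binding name made up:

```yaml
# Illustrative only; metadata names and namespace are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: grant-anyuid-to-default
  namespace: opendatahub
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:anyuid
subjects:
  - kind: ServiceAccount
    name: default
    namespace: opendatahub
```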

After some reading of https://docs.openshift.com/container-platform/4.16/authentication/managing-security-context-constraints.html#security-context-constraints-pre-allocated-values_configuring-internal-oauth, it sounds like these namespace-level annotations take effect only if the pod .spec does not set anything.

As for the change in istio, if I understand it correctly: func getPreallocatedSupplementalGroups(ns *corev1.Namespace) ([]securityv1.IDRange, error) is called after the pod is up, in order to generate a template, which is different from our case in the Operator itself.
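
Those pre-allocated range annotations use a `<start>/<size>` format. A small self-contained sketch of parsing them (a hypothetical helper for illustration, not code from this repo or from istio):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseIDRange parses an OpenShift pre-allocated range annotation value
// such as "1000700000/10000" (start/size) into the first and last ID
// of the range. Hypothetical helper, for illustration only.
func parseIDRange(v string) (min, max int64, err error) {
	parts := strings.SplitN(v, "/", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("invalid range %q", v)
	}
	start, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return 0, 0, err
	}
	size, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return 0, 0, err
	}
	return start, start + size - 1, nil
}

func main() {
	// Value taken from the openshift.io/sa.scc.uid-range annotation above.
	min, max, err := parseIDRange("1000700000/10000")
	if err != nil {
		panic(err)
	}
	fmt.Println(min, max) // 1000700000 1000709999
}
```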

I did a test:

BUT if I remove priority from restricted-pipeline-scc, then the Operator pod works: uid is set to 1001.

I guess/think/assume the whole combination:

zdtsw commented 2 months ago

> Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)

The use case for our Operator with write permission:

  • update existing manifest files: this is something I don't have a better solution for right now.
  • create tmp files before the final dump into ^ these files. (this we could rewrite the logic to avoid)

This is likely the biggest difference between our operator and other products' operators: they do not need to set a UID and can use the namespace-populated range as the running ID, which is not a problem unless an on-disk write operation happens there.

We could make all manifests live under /var/manifests instead of /opt/manifests, but this does not really resolve the problem. The current user in the Operator pod is uid=1001(1001) gid=0(root) groups=0(root), so I am skeptical about opening up more (at the group level) rather than narrowing down to exactly what is needed (for a specific user).

ykaliuta commented 2 months ago

> > Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)
>
> The use case for our Operator with write permission:
>
> • update existing manifest files: this is something I don't have a better solution for right now.

What is the problem with making it group writable?

> • create tmp files before the final dump into ^ these files. (this we could rewrite the logic to avoid)
>
> This is likely the biggest difference between our operator and other products' operators: they do not need to set a UID and can use the namespace-populated range as the running ID, which is not a problem unless an on-disk write operation happens there.
>
> We could make all manifests live under /var/manifests instead of /opt/manifests, but this does not really resolve the problem.

If on startup the operator copies the read-only /opt/manifests to /var/manifests under its own UID with a typical umask, it gets a writable directory.

> The current user in the Operator pod is uid=1001(1001) gid=0(root) groups=0(root), so I am skeptical about opening up more (at the group level) rather than narrowing down to exactly what is needed (for a specific user).

What are the security concerns about those UIDs in the container? There is no sensitive content, and the UIDs are virtual from the host's point of view.

zdtsw commented 2 months ago

We will need to revisit this UID later. Marking it as draft (we should close it, but then it often gets lost after a while).