stackabletech / issues

This repository is only for issues that concern multiple repositories or don't fit into any specific repository

Investigate getting rid of custom SCCs #607

Closed razvan closed 3 months ago

razvan commented 4 months ago

Description

Currently, our OLM packages deploy a custom SecurityContextConstraint (SCC) based on the hostmount-anyuid SCC. This issue is about getting rid of that custom SCC and moving to the nonroot-v2 SCC.

Background / Research

Using the SecurityContext.RunAsUser functionality, users can, in theory, set the user that an image runs as to an arbitrary UID. Our operators currently (as of 2024-07-16) hardcode this to 1000:

pub const NIFI_UID: i64 = 1000;
...
PodSecurityContextBuilder::new()
  .run_as_user(NIFI_UID)
  .run_as_group(0)
  .fs_group(1000)
  .build()

For this to work (read: to be allowed), at least on OpenShift, we need a SecurityContextConstraint that allows us to run as an arbitrary (non-root) UID.

In the past we deployed our own SCC, derived from the hostmount-anyuid SCC (which looked different back then), because the default SCCs didn't allow ephemeral volumes. This changed in September 2022 without us noticing, and all default SCCs now include the ephemeral permission. The hostmount-anyuid SCC allows (as the name implies) Pods to run as any user, including root. This is not good, and we can probably switch to nonroot-v2 (we ran some tests and it looks good), which is slightly less strict than restricted-v2.

[!NOTE]
nonroot-v2 provides all features of the restricted-v2 SCC, but allows users to run with any non-root UID.

This is a change we can make immediately, and it should not require any code changes. It should only require changes in the bundling of our OLM packages, and it needs to be documented in the release notes.

Our final goal should be to move to restricted-v2 wherever possible but that is out-of-scope for this issue and will be part of a follow-up.
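
To make the difference between the three SCCs mentioned above more concrete, here is a minimal sketch in Rust that models only the runAsUser part of an SCC. It is an illustration under assumptions, not how OpenShift admission is implemented: the type `RunAsUserStrategy` and the helper functions are hypothetical names, the mapping of SCCs to strategies reflects our understanding of the default SCC definitions (hostmount-anyuid: RunAsAny, nonroot-v2: MustRunAsNonRoot, restricted-v2: MustRunAsRange), and the UID range used for restricted-v2 is a made-up example because the real range is assigned per namespace.

```rust
/// Simplified, hypothetical model of the `runAsUser` strategy of an SCC.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RunAsUserStrategy {
    /// Any UID is accepted, including root (UID 0).
    RunAsAny,
    /// Any UID except root is accepted.
    MustRunAsNonRoot,
    /// Only UIDs inside an inclusive range are accepted.
    MustRunAsRange { min: i64, max: i64 },
}

impl RunAsUserStrategy {
    /// Returns true if a Pod requesting `uid` would pass this strategy.
    pub fn allows(&self, uid: i64) -> bool {
        match *self {
            RunAsUserStrategy::RunAsAny => uid >= 0,
            RunAsUserStrategy::MustRunAsNonRoot => uid > 0,
            RunAsUserStrategy::MustRunAsRange { min, max } => (min..=max).contains(&uid),
        }
    }
}

/// The UID our operators currently hardcode (taken from the snippet above).
pub const NIFI_UID: i64 = 1000;

/// Assumption: hostmount-anyuid (and our custom SCC derived from it) uses RunAsAny.
pub fn hostmount_anyuid() -> RunAsUserStrategy {
    RunAsUserStrategy::RunAsAny
}

/// Assumption: nonroot-v2 uses MustRunAsNonRoot.
pub fn nonroot_v2() -> RunAsUserStrategy {
    RunAsUserStrategy::MustRunAsNonRoot
}

/// Assumption: restricted-v2 uses MustRunAsRange. The range below is a
/// made-up example; OpenShift assigns the real range per namespace.
pub fn restricted_v2_example() -> RunAsUserStrategy {
    RunAsUserStrategy::MustRunAsRange {
        min: 1_000_650_000,
        max: 1_000_659_999,
    }
}
```

Under these assumptions, `nonroot_v2().allows(NIFI_UID)` holds while `nonroot_v2().allows(0)` does not, which is the behaviour we want. `restricted_v2_example().allows(NIFI_UID)` fails, which is why moving to restricted-v2 would also mean no longer hardcoding the UID and is left to the follow-up.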

Value

Dependencies

Tasks

Acceptance Criteria

(Information Security) Risk Assessment

This will improve the security of our product, as it will allow us to run on the default settings of OpenShift without our customers having to audit a custom SCC. It will also allow us to move to a more restrictive SCC going forward.

Quality

We need to run all integration tests (and maybe even demos) across all our products.

Release Notes

Historically, our Operator Lifecycle Manager (OLM) packaging for OpenShift deployed a custom SecurityContextConstraint. This used to be required for us to be able to create ephemeral volumes. The default SCCs in all OpenShift versions we support now allow this, so we are switching to the nonroot-v2 SCC by default.

adwk67 commented 3 months ago

Manifests created, operator deployed (🟠) and OpenShift test suite run (🟢) for the following operators with no custom SCC:

adwk67 commented 3 months ago

TODOs / unclear

adwk67 commented 3 months ago

stackable-utils changes for this issue: https://github.com/stackabletech/stackable-utils/pull/86

razvan commented 3 months ago

This issue addresses the security context part described here.

This issue can now be considered done.

sbernauer commented 3 months ago

This issue can now be considered done.

@razvan can we move this to "Development Done"? I'm not sure what to review here to be honest 🙈

adwk67 commented 3 months ago

Yes, closing this as outstanding work will be addressed in separate tickets.

PaulienVa commented 2 months ago

@adwk67 will this also be fixed for the airflow-operator? We also need it there.

adwk67 commented 2 months ago

@PaulienVa yes, this will be done for all operators: please follow this issue for updates/progress etc.