go-vela / community

Community Information for Vela (Target's official Pipeline Automation Framework)
https://go-vela.github.io/docs/
Apache License 2.0

worker: Kubernetes Runtime Container SecurityContext #515

Open cognifloyd opened 2 years ago

cognifloyd commented 2 years ago

Description

Add SecurityContext to containers in the kubernetes runtime as noted in this TODO:

https://github.com/go-vela/worker/blob/19081a02a335086f774e8a73f595e9efb873659d/runtime/kubernetes/container.go#L173

    // TODO: add SecurityContext options (runAsUser, runAsNonRoot, sysctls)

Several settings can be configured in SecurityContext. Some can only be set on the whole Pod, some can only be set per container, and some can be set at either level (with the container-level value overriding the pod-level one).

| Setting | Type | Pod | Container | Configure: Worker w/ default | Configure: Step override (if worker allows) | Configure: Pipeline override (if worker allows) | Configure: Apply if volumes need it |
| --- | --- | --- | --- | --- | --- | --- | --- |
| allowPrivilegeEscalation | boolean | | :white_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: implicitly toggled by privileged | :heavy_minus_sign: | |
| capabilities | object | | :white_check_mark: | :heavy_plus_sign: :heavy_check_mark: via CRD | :heavy_plus_sign: | :heavy_minus_sign: | |
| - add, drop | string arrays | | | | | | |
| fsGroup | integer | :white_check_mark: | | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :floppy_disk: |
| fsGroupChangePolicy | string | :white_check_mark: | | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :floppy_disk: |
| privileged | boolean | | :white_check_mark: | :heavy_plus_sign: :heavy_check_mark: allow list of images via opt/env | :heavy_plus_sign: config exists | :heavy_minus_sign: | |
| procMount | string | | :white_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | |
| readOnlyRootFilesystem | boolean | | :white_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :floppy_disk: |
| runAsGroup | integer | :white_check_mark: | :white_check_mark: | :heavy_minus_sign: | :heavy_plus_sign: | :grey_question: | |
| runAsNonRoot | boolean | :white_check_mark: | :white_check_mark: | :heavy_plus_sign: :heavy_check_mark: pod-level only via CRD | :heavy_plus_sign: | :grey_question: | |
| runAsUser | integer | :white_check_mark: | :white_check_mark: | :heavy_minus_sign: | :heavy_plus_sign: config exists (missing in k8s) | :heavy_minus_sign: | |
| seLinuxOptions | object | :white_check_mark: | :white_check_mark: | :grey_question: | :grey_question: | :grey_question: | |
| - level, role, type, user | strings | | | | | | |
| seccompProfile | object | :white_check_mark: | :white_check_mark: | :grey_question: | :grey_question: | :grey_question: | |
| - localhostProfile, type | strings | | | | | | |
| supplementalGroups | integer array | :white_check_mark: | | :heavy_minus_sign: | :heavy_minus_sign: | :grey_question: | :floppy_disk: |
| sysctls | object array | :white_check_mark: | | :heavy_plus_sign: :heavy_check_mark: via CRD | :heavy_minus_sign: ulimits exists (but k8s can't do per-step sysctls) | :heavy_plus_sign: | |
| - name, value | strings | | | | | | |
| windowsOptions | object | :white_check_mark: | :white_check_mark: | N/A | N/A | N/A | |
| - gmsa*, hostProcess, runAsUserName | mixed | | | | | | |
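To illustrate the "container-level overrides pod-level" rule for settings that exist in both places, here is a minimal Go sketch. The `PodSecurity` and `ContainerSecurity` types are hypothetical, trimmed-down stand-ins for the Kubernetes `PodSecurityContext` and container `SecurityContext`. They are not the worker's actual types, and `effectiveRunAsUser` is an illustrative helper only.

```go
package main

import "fmt"

// PodSecurity is a hypothetical stand-in for the pod-level security context.
// A nil pointer means "not set at this level".
type PodSecurity struct {
	RunAsUser *int64
	FSGroup   *int64 // pod-only setting: no container-level counterpart
}

// ContainerSecurity is a hypothetical stand-in for the container-level
// security context.
type ContainerSecurity struct {
	RunAsUser  *int64
	Privileged *bool // container-only setting: no pod-level counterpart
}

// effectiveRunAsUser returns the container-level value when it is set,
// otherwise the pod-level value. ok is false when neither level sets it.
func effectiveRunAsUser(pod PodSecurity, ctn ContainerSecurity) (uid int64, ok bool) {
	if ctn.RunAsUser != nil {
		return *ctn.RunAsUser, true
	}
	if pod.RunAsUser != nil {
		return *pod.RunAsUser, true
	}
	return 0, false
}

// int64Ptr is a small helper for building optional values.
func int64Ptr(v int64) *int64 { return &v }

func main() {
	pod := PodSecurity{RunAsUser: int64Ptr(1000), FSGroup: int64Ptr(2000)}
	step := ContainerSecurity{RunAsUser: int64Ptr(1234)} // overrides the pod value
	service := ContainerSecurity{}                       // inherits the pod value

	uid, _ := effectiveRunAsUser(pod, step)
	fmt.Println("step runs as", uid)
	uid, _ = effectiveRunAsUser(pod, service)
	fmt.Println("service runs as", uid)
}
```

Pod-only settings such as `fsGroup` have no per-step equivalent, which is why they would need pipeline-wide or worker-level configuration.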

Value

Allow using Vela in clusters where an admission controller blocks the creation of pods unless SecurityContext requirements are met. Make pipelines follow the principle of least privilege: as with "privileged", only increase access if requested (and permitted).

Definition of Done

Effort (Optional)

Adding the SecurityContext should be straightforward. How to configure it, however, is not clear.

Impacted Personas (Optional)

Anyone who uses the kubernetes runtime and wants to apply SecurityContext settings.

cognifloyd commented 2 years ago

So, how do we add more complex config to the worker like this? Shove JSON into env vars?

JordanSussman commented 2 years ago

> So, how do we add more complex config to the worker like this? Shove JSON into env vars?

Do you envision that you would want to configure this at the global (server) level and/or at the individual pipeline (.vela.yml) level? I suppose we could add more configuration options underneath the worker key if you want it to be somewhat configurable at the pipeline level.

cognifloyd commented 2 years ago

> Do you envision that you would want to configure this at the global (server) level and/or at the individual pipeline (.vela.yml) level?

Hmm. I imagine a hybrid model.

That is pretty straightforward (I hope), until you consider pod SecurityContext settings that can't be configured per step. If any of that needs to be configured per pipeline, then we would need some new top-level security tag.

I'll think a bit more and then I'll extend my chart to show where things could be configurable (pipeline, pipeline step, worker default, worker allows changing).

cognifloyd commented 2 years ago

Worker Runtime Config

OK. So the worker doesn't need as many SecurityContext defaults as I thought.

Currently, the worker has one option:

Here, I'm proposing we add some kind of config to configure per-worker defaults for (the type, in italics, could be easily defined in new ENV vars or CLI args):

edit: added checkmark to show that these were made configurable. Instead of env/opt, they are configurable via a CRD

I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.

cognifloyd commented 2 years ago

Add tags to Step/Service

The Step or Service already define these SecurityContext-related tags:

Here are the things that I'd like to see configurable (eventually) per pipeline step/service:

I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.

cognifloyd commented 2 years ago

Add new pipeline-wide config block

With the docker runner, the lifecycle and settings of each step/service are basically isolated, so there is no requirement to configure something for all steps and services.

With the kubernetes runner, however, the pipeline maps to the pod, and some settings can only be set at the pod level. So we might need to expose some top-level pipeline config to configure those pod-level settings.

The biggest candidate for this is:

Once there is such a pipeline-wide config mechanism, these are also candidates for that:

Plus, such a pipeline-wide config block could be used to simplify repetitive definitions of other settings like user, pull, and capabilities across all steps/services.

I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.

cognifloyd commented 2 years ago

SecurityContext and Volumes

Currently, only host volumes are supported. In the future, we might expose additional volume types. When that happens, several SecurityContext settings might need to be added implicitly to support that volumes feature. Any new config in the pipeline would probably be tied to those volumes.

Also, the procMount setting seems very esoteric (not useful) to me, so I skipped that, but I suppose it could also be part of the volumes config if someone needed it.

cognifloyd commented 2 years ago

OK. I've added a chart in the issue description. Then I added a comment to summarize / describe each of the four rightmost columns.

I need capabilities. I want to configure most of my workers with capabilities.drop=ALL and then add the required capabilities to each step/service.
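As a rough sketch of that workflow, the Go snippet below merges a worker-level default (`drop: ALL`) with the capabilities a step asks for. The `Capabilities` type and the `mergeCapabilities` and `union` helpers are hypothetical names used for illustration; they are not part of the worker code base.

```go
package main

import (
	"fmt"
	"sort"
)

// Capabilities mirrors the add/drop string arrays of the Kubernetes
// capabilities object. The type itself is a hypothetical stand-in.
type Capabilities struct {
	Add  []string
	Drop []string
}

// union returns the sorted, de-duplicated union of two string slices.
func union(a, b []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, s := range append(append([]string(nil), a...), b...) {
		if !seen[s] {
			seen[s] = true
			out = append(out, s)
		}
	}
	sort.Strings(out)
	return out
}

// mergeCapabilities combines the worker default with a step's request.
func mergeCapabilities(workerDefault, step Capabilities) Capabilities {
	return Capabilities{
		Add:  union(workerDefault.Add, step.Add),
		Drop: union(workerDefault.Drop, step.Drop),
	}
}

func main() {
	workerDefault := Capabilities{Drop: []string{"ALL"}}
	step := Capabilities{Add: []string{"NET_BIND_SERVICE"}}

	merged := mergeCapabilities(workerDefault, step)
	fmt.Printf("add=%v drop=%v\n", merged.Add, merged.Drop)
}
```

The merged result keeps the worker's `drop: ALL` baseline, so each step or service only gains the capabilities it lists explicitly.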

cognifloyd commented 2 years ago

> I suppose we could add more configuration options underneath the worker key if you want it to be somewhat configurable at the pipeline level.

@JordanSussman Oh. You weren't suggesting adding additional routing. You were suggesting that pipeline-level config could be defined under the worker key.

So, something like this (example sysctls from k8s docs)?

    worker:
      flavor: foobar
      platform: k8s
      runAsGroup: 1234
      runAsNonRoot: true
      supplementalGroups:
        - 5678
        - 9012
      sysctls:
        kernel.shm_rmid_forced: "0"
        net.core.somaxconn: "1024"
        kernel.msgmax: "65536"

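The example above writes sysctls as a map, while the Kubernetes pod-level `sysctls` field is an array of name/value objects (see the chart in the description). A minimal Go sketch of that conversion follows; the `Sysctl` type and `toSysctls` helper are hypothetical and only illustrate the shape change.

```go
package main

import (
	"fmt"
	"sort"
)

// Sysctl mirrors the name/value pair used in the pod-level sysctls
// array. It is a hypothetical local type, not the Kubernetes API type.
type Sysctl struct {
	Name  string
	Value string
}

// toSysctls converts the map form from the pipeline YAML into a
// name-sorted slice, so the generated pod spec is deterministic.
func toSysctls(m map[string]string) []Sysctl {
	out := make([]Sysctl, 0, len(m))
	for name, value := range m {
		out = append(out, Sysctl{Name: name, Value: value})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	fromPipeline := map[string]string{
		"kernel.shm_rmid_forced": "0",
		"net.core.somaxconn":     "1024",
		"kernel.msgmax":          "65536",
	}
	for _, s := range toSysctls(fromPipeline) {
		fmt.Printf("%s=%s\n", s.Name, s.Value)
	}
}
```

Sorting by name is a design choice for reproducible output, since Go map iteration order is not stable.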
kneal commented 2 years ago

I think I like keeping the controls on the admin side. I do think it could expand into maybe some new routing keys in the worker: block but I'm not sure exposing that at the user level would work super well with the current setup.

cognifloyd commented 2 years ago

Worker Runtime Config

Instead of relying on an ever-increasing list of env vars / options, I added a PipelinePodsTemplate CRD that allows admins to specify defaults to use in the Pods created by that worker.

Now that go-vela/worker#294 is merged, the admin can configure these bits:

I've added check marks in the chart to mark which things are implemented.

kneal commented 2 years ago

> Instead of relying on an ever-increasing list of env vars

It's probably worth noting that one of the reasons we use urfave/cli so heavily for injecting config is that it offers a lot of options for setting the configuration. You don't necessarily have to use env. You could have something like an agent.yml or server.yml within the deployed service container to set the config.

It's not really documented anywhere on our side, but it is a feature of the library. It's how/why the Vela CLI has a config file option.

cognifloyd commented 2 years ago

> You could have something like an agent.yml or server.yml within the deployed service container to set the config.

I thought that that required a separate file for each config option, doesn't it? Is there a way to put all of those config options in a single file with urfave/cli?

cognifloyd commented 2 years ago

Another thought: When we expand the pipeline YAML to make more of these options configurable, the admin will need a way to configure the worker to say which of those options are allowed on a per-worker basis. We can easily expand the CRD to cover "allowed pipeline overrides".
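A rough sketch of what such an "allowed pipeline overrides" check could look like on the worker side is shown below. The `AllowedOverrides` type and `checkOverrides` function are hypothetical; nothing like them exists in the CRD today, and the setting names are only examples taken from the chart.

```go
package main

import "fmt"

// AllowedOverrides is a hypothetical per-worker allow list: the set of
// SecurityContext settings a pipeline may override on this worker.
type AllowedOverrides map[string]bool

// checkOverrides returns an error naming the first requested setting
// that the worker does not allow pipelines to override.
func checkOverrides(allowed AllowedOverrides, requested []string) error {
	for _, name := range requested {
		if !allowed[name] {
			return fmt.Errorf("worker does not allow pipelines to override %q", name)
		}
	}
	return nil
}

func main() {
	allowed := AllowedOverrides{"capabilities": true, "runAsUser": true}

	fmt.Println(checkOverrides(allowed, []string{"capabilities"}))
	fmt.Println(checkOverrides(allowed, []string{"capabilities", "privileged"}))
}
```

Rejecting the whole request, rather than silently dropping the disallowed setting, keeps the failure visible to the pipeline author.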

kneal commented 2 years ago

We might need to add a new flag for a single entrypoint but you can do a single file. Here's the CLI one: https://github.com/go-vela/cli/blob/master/cmd/vela-cli/main.go#L72-L80

All of the ENV configs in the CLI can be used within that file with the name parameter in the flag. So, we could likely just add a new flag like the first one I linked and create a much simpler admin experience.

cognifloyd commented 2 years ago

> We might need to add a new flag for a single entrypoint but you can do a single file. Here's the CLI one: https://github.com/go-vela/cli/blob/master/cmd/vela-cli/main.go#L72-L80
>
> All of the ENV configs in the CLI can be used within that file with the name parameter in the flag. So, we could likely just add a new flag like the first one I linked and create a much simpler admin experience.

Oh. Cool! We'll probably want to clean up the Name: of all the options before we expose that so that they're a bit more consistent (. vs _ vs - vs camelCase).

kneal commented 2 years ago

Yeah, that would be a big thing because a hyphen or underscore keeps the key at the top level of the YAML file, while the dot syntax turns that section into an object with keys underneath. The CLI's config file shows an example of this.
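To make the difference concrete, here is a small Go sketch of how a flag name could be resolved against a parsed YAML config file. It is an illustration of the behavior described above, not the library's actual code: the `lookup` function, the flag names, and the example URL are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup resolves a flag name against a parsed YAML document. It first
// tries the name as a flat top-level key, then treats dots as nesting.
func lookup(tree map[string]interface{}, name string) (interface{}, bool) {
	if v, ok := tree[name]; ok {
		return v, true
	}
	var node interface{} = tree
	for _, part := range strings.Split(name, ".") {
		m, ok := node.(map[string]interface{})
		if !ok {
			return nil, false
		}
		node, ok = m[part]
		if !ok {
			return nil, false
		}
	}
	return node, true
}

func main() {
	// Equivalent to a config file with a top-level "runtime-driver" key
	// and an "api" object that contains "addr".
	config := map[string]interface{}{
		"runtime-driver": "kubernetes",
		"api": map[string]interface{}{
			"addr": "https://vela.example.com",
		},
	}

	v, _ := lookup(config, "runtime-driver") // hyphenated: top-level key
	fmt.Println(v)
	v, _ = lookup(config, "api.addr") // dotted: nested under "api"
	fmt.Println(v)
}
```

This is why mixing `.`, `_`, and `-` in flag names would produce an inconsistent file layout: only the dotted names end up grouped under a shared parent key.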