Open cognifloyd opened 2 years ago
So, how do we add more complex config to the worker like this? Shove JSON into env vars?
Do you envision that you would want to configure this at the global (server) level and/or at the individual pipeline (.vela.yml) level? I suppose we could add more configuration options underneath the worker key if you want it to be somewhat configurable at the pipeline level.
Do you envision that you would want to configure this at the global (server) level and/or at the individual pipeline (.vela.yml) level?
Hmm. I imagine a hybrid model.
- worker block you linked to.
- flavor.
- privileged.
That is pretty straightforward (I hope), until you consider pod SecurityContext settings that can't be configured per step. If any of that needs to be configured per pipeline, then we would need some new top-level security tag.
I'll think a bit more and then I'll extend my chart to show where things could be configurable (pipeline, pipeline step, worker default, worker allows changing).
OK. So the worker doesn't need as many SecurityContext defaults as I thought.
Currently, the worker has one option: the runtime.privileged-images configuration, which enables pipeline steps/services to use the privileged tag.
Here, I'm proposing we add some kind of config to configure per-worker defaults for (the type, in italics, could be easily defined in new ENV vars or CLI args):
- capabilities.add: string list
- capabilities.drop: string list
- runAsNonRoot: boolean
- sysctls: list of key:value pairs (?)
edit: added checkmark to show that these were made configurable. Instead of env/opt, they are configurable via a CRD.
I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.
Step/Service
The Step or Service already define these SecurityContext-related tags:
- privileged (which k8s also uses to manage allowPrivilegeEscalation, so we can ignore that)
- user, which would define runAsUser but has not been implemented in the kubernetes runtime yet. This has probably not been implemented because user is a string, but runAsUser is an integer, so we'd need a way to look up the user id to use based on the user tag.
- ulimits, which could be implemented using sysctls in SecurityContext, but that would apply to all containers/steps/services in the pipeline, not just the step the ulimits are defined on. So, the k8s runtime cannot support the ulimits tag.
Here are the things that I'd like to see configurable (eventually) per pipeline step/service:
- capabilities: map, would be merged with the worker's defaults iff the worker allows them to be modified.
- runAsNonRoot: boolean, overrides the worker's default iff the worker allows that.
- runAsGroup: integer, the group id (not sure how much value this would have in pipelines)
I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.
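The string-versus-integer mismatch between user and runAsUser noted above could be sidestepped, at least for the easy case, by accepting only numeric values. The following is a minimal sketch under that assumption. The helper parseRunAsUser is hypothetical (it is not part of the Vela worker), and the "uid:gid" form is assumed by analogy with Docker's user syntax. Named users would still need a lookup inside the image, which this does not attempt.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRunAsUser is a hypothetical helper: it converts a step's "user" string
// into the int64 ids that Kubernetes runAsUser/runAsGroup fields expect.
// Only numeric "uid" or "uid:gid" forms are accepted.
func parseRunAsUser(user string) (*int64, *int64, error) {
	if user == "" {
		return nil, nil, nil // nothing requested; keep the image default
	}
	uidPart, gidPart, hasGroup := strings.Cut(user, ":")
	u, err := strconv.ParseInt(uidPart, 10, 64)
	if err != nil || u < 0 {
		return nil, nil, fmt.Errorf("user %q is not a numeric uid", user)
	}
	if !hasGroup {
		return &u, nil, nil
	}
	g, err := strconv.ParseInt(gidPart, 10, 64)
	if err != nil || g < 0 {
		return nil, nil, fmt.Errorf("group in %q is not a numeric gid", user)
	}
	return &u, &g, nil
}

func main() {
	for _, s := range []string{"1000", "1000:2000", "vela"} {
		uid, gid, err := parseRunAsUser(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Print("uid=", *uid)
		if gid != nil {
			fmt.Print(" gid=", *gid)
		}
		fmt.Println()
	}
}
```

Rejecting named users outright keeps the behavior predictable; whether that is acceptable for existing pipelines is an open question.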
With the docker runner, the lifecycle and settings of each step/service are basically isolated, so there is no requirement to configure something for all steps and services.
With the kubernetes runner, however, the pipeline maps to the pod, and some settings can only be set at the pod level. So we might need to expose some top-level pipeline config to configure those pod-level settings.
The biggest candidate for this is:
- sysctls, which is mentioned in the code comment TODO, but cannot be defined per container/step/service.
Once there is such a pipeline-wide config mechanism, these are also candidates for that:
- runAsGroup
- runAsNonRoot
- supplementalGroups
Plus, such a pipeline-wide config block could be used to simplify repetitive definitions of other settings like user, pull, and capabilities across all steps/services.
I don't have a clear idea of when/how people would want to use seLinuxOptions or seccompProfile, so I'm ignoring those for now.
Currently, only host volumes are supported. In the future, we might expose additional volume types. When that happens, several SecurityContext settings might need to be added implicitly to support that volumes feature. Any new config in the pipeline would probably be tied to those volumes.
- fsGroup
- fsGroupPolicy
- readOnlyRootFilesystem
- supplementalGroups
Also, the procMount setting seems very esoteric (not useful) to me, so I skipped that, but I suppose it could also be part of the volumes config if someone needed it.
OK. I've added a chart in the issue description. Then I added a comment to summarize / describe each of the 4 right most columns.
I need capabilities. I want to configure most of my workers with capabilities.drop=ALL and then add the required capabilities to each step/service.
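To make the "drop ALL on the worker, add per step" idea concrete, here is a minimal sketch of how such a merge might look. The Capabilities type and the mergeCapabilities function are hypothetical names for illustration, not existing Vela or Kubernetes API, and the union semantics are an assumption about how the merge would behave.

```go
package main

import (
	"fmt"
	"sort"
)

// Capabilities is a hypothetical stand-in for the add/drop lists of a
// container SecurityContext.
type Capabilities struct {
	Add  []string
	Drop []string
}

// union returns the sorted, de-duplicated combination of two lists.
func union(a, b []string) []string {
	seen := map[string]bool{}
	out := []string{}
	for _, list := range [][]string{a, b} {
		for _, c := range list {
			if !seen[c] {
				seen[c] = true
				out = append(out, c)
			}
		}
	}
	sort.Strings(out)
	return out
}

// mergeCapabilities combines the worker defaults with a step's request.
// When the worker does not allow overrides, the step's request is ignored.
func mergeCapabilities(def, step Capabilities, allowOverride bool) Capabilities {
	if !allowOverride {
		return Capabilities{Add: union(def.Add, nil), Drop: union(def.Drop, nil)}
	}
	return Capabilities{
		Add:  union(def.Add, step.Add),
		Drop: union(def.Drop, step.Drop),
	}
}

func main() {
	worker := Capabilities{Drop: []string{"ALL"}}
	step := Capabilities{Add: []string{"NET_BIND_SERVICE"}}
	fmt.Printf("%+v\n", mergeCapabilities(worker, step, true))
	fmt.Printf("%+v\n", mergeCapabilities(worker, step, false))
}
```

A plain union is enough for this use case because dropping ALL and then adding a specific capability leaves the container with just the added one.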
I suppose we could add more configuration options underneath the worker key if you want it to be somewhat configurable at the pipeline level.
@JordanSussman Oh. You weren't suggesting adding additional routing. You were suggesting that pipeline-level config could be defined under the worker key.
So, something like this (example sysctls from k8s docs)?
worker:
  flavor: foobar
  platform: k8s
  runAsGroup: 1234
  runAsNonRoot: true
  supplementalGroups:
    - 5678
    - 9012
  sysctls:
    kernel.shm_rmid_forced: "0"
    net.core.somaxconn: "1024"
    kernel.msgmax: "65536"
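One detail worth noting about the proposed sysctls map: a Kubernetes pod security context takes sysctls as a list of name/value entries rather than a map, so the runtime would need a small conversion. A minimal sketch, assuming a local Sysctl type that only mirrors that name/value shape and a hypothetical helper sysctlsFromMap:

```go
package main

import (
	"fmt"
	"sort"
)

// Sysctl mirrors the name/value shape of a pod-level sysctl entry.
// It is a local stand-in so the example has no external dependencies.
type Sysctl struct {
	Name  string
	Value string
}

// sysctlsFromMap (hypothetical) converts the YAML map form into a list,
// sorted by name so the generated pod spec is deterministic.
func sysctlsFromMap(m map[string]string) []Sysctl {
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)

	out := make([]Sysctl, 0, len(names))
	for _, name := range names {
		out = append(out, Sysctl{Name: name, Value: m[name]})
	}
	return out
}

func main() {
	fromYAML := map[string]string{
		"kernel.shm_rmid_forced": "0",
		"net.core.somaxconn":     "1024",
		"kernel.msgmax":          "65536",
	}
	for _, s := range sysctlsFromMap(fromYAML) {
		fmt.Printf("%s=%s\n", s.Name, s.Value)
	}
}
```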
I think I like keeping the controls on the admin side. I do think it could expand into maybe some new routing keys in the worker: block but I'm not sure exposing that at the user level would work super well with the current setup.
Instead of relying on an ever-increasing list of env vars / options, I added a PipelinePodsTemplate CRD that allows admins to specify defaults to use in the Pods created by that worker.
Now that go-vela/worker#294 is merged, the admin can configure these bits:
- container.SecurityContext.capabilities (both add and drop)
- SecurityContext.RunAsNonRoot (note that this is a validation flag: it forces the pod to fail unless every container image is configured to run as some user other than root. Also note that only the pod-level flag is available. We can add the container-level flag once we have a way to configure/override it from the vela yaml pipeline)
- SecurityContext.Sysctls (here be dragons!)
I've added check marks in the chart to mark which things are implemented.
Instead of relying on an ever-increasing list of env vars
It's probably worth noting that one of the reasons we use urfave/cli so heavily for injecting config is that it offers a lot of options for setting the configuration. You don't necessarily have to use env. You could have something like an agent.yml or server.yml within the deployed service container to set the config.
It's not really documented anywhere on our side, but it is a feature of the library. It's how/why the Vela CLI has a config file option.
You could have a like agent.yml or server.yml within the deployed service container to set the config.
I thought that that required a separate file for each config option, doesn't it? Is there a way to put all of those config options in a single file with urfave/cli?
Another thought: When we expand the pipeline YAML to make more of these options configurable, the admin will need a way to configure the worker to say which of those options are allowed on a per-worker basis. We can easily expand the CRD to cover "allowed pipeline overrides".
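As a rough illustration of what "allowed pipeline overrides" could mean in practice, here is a minimal sketch. WorkerPolicy, its fields, and resolveRunAsNonRoot are hypothetical names, not the PipelinePodsTemplate schema; the sketch also assumes that a disallowed override should be reported as an error rather than silently ignored.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkerPolicy is a hypothetical per-worker policy: a default value plus
// the set of settings a pipeline is allowed to override.
type WorkerPolicy struct {
	RunAsNonRoot     bool
	AllowedOverrides map[string]bool
}

var errOverrideNotAllowed = errors.New("override not allowed by worker")

// resolveRunAsNonRoot returns the value to use for a pipeline. A nil
// request means the pipeline did not ask for anything, so the worker
// default applies. A request for a setting that is not on the allow list
// is rejected and the default is returned alongside the error.
func resolveRunAsNonRoot(p WorkerPolicy, requested *bool) (bool, error) {
	if requested == nil {
		return p.RunAsNonRoot, nil
	}
	if !p.AllowedOverrides["runAsNonRoot"] {
		return p.RunAsNonRoot, errOverrideNotAllowed
	}
	return *requested, nil
}

func main() {
	policy := WorkerPolicy{
		RunAsNonRoot:     true,
		AllowedOverrides: map[string]bool{"capabilities": true},
	}
	relax := false
	got, err := resolveRunAsNonRoot(policy, &relax)
	fmt.Println(got, err)
	got, err = resolveRunAsNonRoot(policy, nil)
	fmt.Println(got, err)
}
```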
We might need to add a new flag for a single entrypoint but you can do a single file. Here's the CLI one: https://github.com/go-vela/cli/blob/master/cmd/vela-cli/main.go#L72-L80
All of the ENV configs in the CLI can be used within that file with the name parameter in the flag. So, we could likely just add a new flag like the first one I linked and create a much simpler admin experience.
Oh. Cool! We'll probably want to clean up the Name: of all the options before we expose that so that they're a bit more consistent (. vs _ vs - vs camelCase).
Yeah, that would be a big thing because a hyphen or underscore keeps the key at the top level of the YAML file, while the dot syntax turns that section into an object with keys underneath. An example of this can also be seen in the CLI.
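The difference between dotted and hyphenated names can be sketched as a lookup over a parsed YAML document. This is only an illustration of the behavior described above, not urfave/cli's actual code; the lookup function and the key names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup resolves a flag name against a parsed config tree. Each "."
// in the name descends one level; hyphens and underscores are treated
// as ordinary characters, so such names stay at the current level.
func lookup(name string, tree map[string]any) (any, bool) {
	sections := strings.Split(name, ".")
	node := tree
	for _, key := range sections[:len(sections)-1] {
		child, ok := node[key].(map[string]any)
		if !ok {
			return nil, false
		}
		node = child
	}
	v, ok := node[sections[len(sections)-1]]
	return v, ok
}

func main() {
	// Equivalent to this YAML (hypothetical keys):
	//   flat-key: a
	//   nested:
	//     key: b
	config := map[string]any{
		"flat-key": "a",
		"nested":   map[string]any{"key": "b"},
	}
	for _, name := range []string{"flat-key", "nested.key", "nested-key"} {
		v, ok := lookup(name, config)
		fmt.Println(name, v, ok)
	}
}
```

This is why inconsistent separators matter: renaming a flag from a hyphen to a dot changes where its value has to live in the file.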
Description
Add SecurityContext to containers in the kubernetes runtime as noted in this TODO: https://github.com/go-vela/worker/blob/19081a02a335086f774e8a73f595e9efb873659d/runtime/kubernetes/container.go#L173
There are several settings that can be configured in SecurityContext, some of which can only be set on the whole Pod, others can only be set per-container, and others can be set in either context (container-level overriding pod-level).
Worker w/ default
Step override (if worker allows)
Pipeline override (if worker allows)
volumes need it
implicitly toggled by privileged
:heavy_check_mark: via CRD
:heavy_check_mark: allow list of images via opt/env
config exists
:heavy_check_mark: pod-level only via CRD
config exists (missing in k8s)
:heavy_check_mark: via CRD
ulimits exists (but k8s can't do per-step sysctls)
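For the settings that can be set in either context, the "container-level overriding pod-level" rule can be sketched as a small helper. The function name effective is hypothetical; the sketch assumes unset values are represented as nil pointers, as optional Kubernetes fields usually are.

```go
package main

import "fmt"

// effective returns the value that applies to a container: the
// container-level setting when present, otherwise the pod-level one.
// A nil result means neither level set the value.
func effective[T any](podLevel, containerLevel *T) *T {
	if containerLevel != nil {
		return containerLevel
	}
	return podLevel
}

func main() {
	yes, no := true, false

	// Both levels set runAsNonRoot: the container-level value wins.
	fmt.Println(*effective[bool](&yes, &no))

	// Only the pod level sets it: the container inherits it.
	fmt.Println(*effective[bool](&yes, nil))

	// Neither level sets it: the result stays unset.
	fmt.Println(effective[bool](nil, nil) == nil)
}
```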
Value
Allow using Vela in clusters where an admission controller blocks the creation of pods unless SecurityContext requirements are met. Make pipelines follow the principle of least privilege: as with "privileged", only increase access if requested (and permitted).
Definition of Done
Effort (Optional)
Adding the SecurityContext should be straightforward. But how to configure it is not clear.
Impacted Personas (Optional)
Anyone who uses the kubernetes runtime and wants to apply SecurityContext settings.