Open vlibov opened 5 months ago
Thank you for opening your first issue here! 🛠
To address the issue of configuring SecurityContext
for Pods/Containers started by Flyte, you can leverage the PodTemplate feature and make specific adjustments to the Flyte configuration. Here are the steps:
Disable SYS_PTRACE Capability: According to this discussion, the SYS_PTRACE capability is required for Flyte Sidecar but can be configured. You can modify the security context to drop this capability.
Apply PodTemplates to InitContainers: As per this GitHub issue, you can configure PodTemplates to apply to all containers, including initContainers. This ensures that the security context settings are uniformly applied.
Here is an example of how you can configure the security context in your PodTemplate:
apiVersion: v1
kind: PodTemplate
metadata:
name: secure-flyte-pod
template:
spec:
securityContext:
runAsNonRoot: true
fsGroup: 65534
runAsUser: 1001
fsGroupChangePolicy: "Always"
seLinuxOptions:
type: spc_t
containers:
- name: main-container
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
initContainers:
- name: flyte-copilot-downloader
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
Would you like more details on how to implement these configurations?
Sources:
This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.
Mention @runllm to ask follow-up questions. See runllm.com for more!
Including a configuration for initContainer in the podtemplate as suggested above leads to a duplication error:
Workflow[kodi:development:workflows.kodi-training.kodi_training_wf] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [Invalid] failed to create resource, caused by: Pod "aqmll4kts4gghgvgqljv-n0-0-n0-n1-0-n0-0" is invalid: spec.initContainers[1].name: Duplicate value: "flyte-copilot-downloader"
Thus, flyte does not check for already existing initContainer definitions in the template and tries to a new one, hence the name conflict.
Setting "default" as the name of initContainer (as for the main containers) will not work, as the names have to be unique in the template.
@vlibov do you plan to contribute the proposed changes?
Hi @davidmirror-ops, yes!
The SYS_PTRACE is added in flyteplugins\go\tasks\pluginmachinery\flytek8s\copilot.go
Function AddCoPilotToContainer(..)
c.SecurityContext.Capabilities.Add = append(c.SecurityContext.Capabilities.Add, pTraceCapability)
If we want to make this configurable - what is the best way of propagating the corresponding setting?
@EngHabu do you think, instead of even making it configurable, we could drop the SYS_PTRACE capability from the copilot container?
Any news on this @davidmirror-ops @EngHabu ? In the meantime I did the necessary changes here in my forked repo: https://github.com/vlibov/flyte/tree/podsecurity (removing SYS_PTRACE altogether - i.e. hardcoding the condition to false and including the SecuritySettings also for initContainers). What is the process from here (if the changes look right) - simply creating a pull request for the official repo?
@vlibov feel free to submit the PR, we can then review and discuss. Thanks!!
@davidmirror-ops Done, see https://github.com/flyteorg/flyte/pull/5556 Looking forward to any comments!
Motivation: Why do you think this is important?
To run workloads in a security-hardenend kubernetes cluster, the pods and containers must fullfill certain requirements, in particular regarding the SecurityContext fields (pod level, container level).
It is possible to set those in flyte e.g. by utilizing the PodTemplate native k8s resource. However it seems that these settings won't be applied to the initContainers (flyte-copilot-downloader). Furthermore, there are some settings which are applied to the main container (addéd capability SYS_PTRACE) without a possibility to disable it.
This prevents running flyte workflows in a hardened namespace, which is somewhat unfortunate.
It would be nice to be able to configure those, i.e. apply the PodTemplate also for the initContainers and make the SYS_PTRACE capability optional.
Goal: What should the final outcome look like, ideally?
Ultimately we need to
Describe alternatives you've considered
One alternative is to run flyte-workloads in a separate namespace which has more elevated rights (i.e. less secure) and restrict the users from creating workloads in the namespace direclty. However this still poses a security risk, because the user still can create workloads indirectly (e.g. create a malicious python code or image and start it as a flyte worklfow).
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?