Closed eero-t closed 1 month ago
I'll look at this issue. Could you provide more information about what issue the empty securityContexts cause? I've verified this runs fine with Gaudi-device-plugin without any special privilege settings; I'll learn more from your link.
Could you provide more information about what issue the empty securityContexts cause?
Such pods cannot be run in clusters with more strict pod security policies (see the "pod-security-standards" link).
In general all unnecessary container privileges should be dropped to reduce likelihood of subverted containers being also able to take over their host. "Defense in depth" etc.
The manifest files are generated by the Helm chart from the GenAIInfra repo. We'll figure out the minimum privileges the workload containers need to run successfully and create a PR there first, for both the Xeon and Gaudi cases, then populate them here later.
Setting runAsUser to a non-root user makes the tgi pod (image: ghcr.io/huggingface/text-generation-inference:1.4) crash. The log shows something like: RuntimeError: cannot cache function 'create_fsm_info': no locator available for file '/opt/conda/lib/python3.10/site-packages/outlines/fsm/regex.py'
Will investigate more into this
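For context, this error message comes from Numba's on-disk function cache, which by default tries to write next to the Python source file; a non-root user cannot write under /opt/conda. A possible workaround (an assumption here, not something confirmed in this thread) is to point Numba at a writable directory through its NUMBA_CACHE_DIR environment variable. The pod name, UID and volume name below are illustrative placeholders:

```yaml
# Sketch only: pod name, UID and volume name are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: tgi-example
spec:
  containers:
    - name: tgi
      image: ghcr.io/huggingface/text-generation-inference:1.4
      securityContext:
        runAsUser: 1000            # assumed non-root UID
      env:
        - name: NUMBA_CACHE_DIR    # tells Numba where to store its cache
          value: /tmp
      volumeMounts:
        - name: tmp
          mountPath: /tmp          # writable scratch space for the cache
  volumes:
    - name: tmp
      emptyDir: {}
```

Whether this is enough for the tgi image to start as non-root would still need to be verified; other paths in the image may also need to be writable.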
The pod security settings have been added to the Helm chart (see PR opea-project/GenAIInfra#133). The manifests here are generated by the Helm chart from the GenAIInfra repo. We're currently discussing how to generate and use that kind of manifest with GMC, and we expect quite major changes to the k8s manifests. So we'll defer the manifest update here until that is resolved. Please see issue opea-project/GenAIInfra#129 for detailed tracking.
Thanks, the merged PR looks good, but there are a few things that could be improved:
- /mnt is not a good host mount point. Dirs mounted from host should be very specific (e.g. /mnt/opea-models), not top level host directories (in worst case, /mnt could e.g. include remote home directory mount points or other security sensitive data)
- Anything that does not write to disk should use the read-only root fs setting
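As a rough sketch, the two suggestions above could look something like this in a pod spec. Only /mnt/opea-models comes from the comment itself; the pod name, image, container mount path and volume names are made-up placeholders, and a workload that needs scratch space would still need a writable volume such as the emptyDir shown:

```yaml
# Sketch only: names, image and container paths are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: example-model-server
spec:
  containers:
    - name: server
      image: example.com/model-server:latest   # placeholder image
      securityContext:
        readOnlyRootFilesystem: true   # container cannot write to its root fs
      volumeMounts:
        - name: model-volume
          mountPath: /data
        - name: tmp
          mountPath: /tmp              # writable scratch space, if needed
  volumes:
    - name: model-volume
      hostPath:
        path: /mnt/opea-models         # specific directory, not all of /mnt
        type: Directory
    - name: tmp
      emptyDir: {}
```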
PR opea-project/GenAIInfra#153 should have resolved this
Yes, looks good! Any idea when those changes will also reach this (GenAIExamples) repository?
The newly added manifests for CodeGen/ChatQnA/DocSum/CodeTrans have security context settings now.
Closed, as CodeGen/ChatQnA/DocSum/CodeTrans have security context settings now.
I would expect seeing pod container securityContexts like this:
And runAsUser setting for something else than the default root [1].
However, securityContexts in this project are either not set, or empty:
For more info, see:
[1] At least for Xeon. Device access (e.g. for Gaudi) may require root user if container runtime is not properly configured: https://kubernetes.io/blog/2021/11/09/non-root-containers-and-devices/
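As a rough illustration of the kind of container securityContext being asked for (not necessarily the exact snippet intended above), the following fragment follows the "restricted" profile of the Kubernetes Pod Security Standards, plus a read-only root filesystem. The UID value is an arbitrary assumption, and device workloads such as Gaudi may need different settings as noted in [1]:

```yaml
# Container-level fragment; sketch only, UID is an assumed value.
securityContext:
  allowPrivilegeEscalation: false   # no setuid-style privilege gain
  capabilities:
    drop:
      - ALL                         # drop all Linux capabilities
  readOnlyRootFilesystem: true      # extra hardening beyond "restricted"
  runAsNonRoot: true                # refuse to start as UID 0
  runAsUser: 1000                   # any non-root UID the image supports
  seccompProfile:
    type: RuntimeDefault            # use the runtime's default seccomp filter
```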