**Open** · blublinsky opened this issue 1 year ago
Since a component is already running in the cluster, do we still need `TokenAuthentication`? Can one invoke `load_incluster_config` instead?
```python
from kubernetes import client, config

loadedconf = config.load_incluster_config()
```
If `TokenAuthentication` is required, how does one update the token in a scheduled, operational environment? Does the token need to be baked into the image as well?
```python
from codeflare_sdk.cluster.auth import TokenAuthentication

# Create authentication object for oc user permissions
auth = TokenAuthentication(
    token=token,
    server="https://kubernetes.default:443",
    skip_tls=True
)
try:
    auth.login()
except Exception as err:
    # error handling elided in the original post
    print(err)
```
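As an aside, the API server address does not have to be hard-coded as it is above: Kubernetes injects the in-cluster address into every pod via environment variables. A minimal sketch (the function name is my own, not from the SDK):

```python
import os

def in_cluster_api_server() -> str:
    # Build the API server URL from the variables the kubelet injects
    # into every pod, instead of hard-coding kubernetes.default:443.
    host = os.environ.get("KUBERNETES_SERVICE_HOST", "kubernetes.default")
    port = os.environ.get("KUBERNETES_SERVICE_PORT", "443")
    return f"https://{host}:{port}"
```

This keeps the snippet working both in clusters with a non-default service CIDR and in the default case.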
The following works:
```python
# Create authentication object for oc user permissions
with open("/var/run/secrets/kubernetes.io/serviceaccount/token", "r") as file:
    token = file.read().rstrip()
auth = TokenAuthentication(token=token, server="https://kubernetes.default:443", skip_tls=True)
try:
    auth.login()
except Exception as err:
    # error handling elided in the original post
    print(err)
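Note that recent Kubernetes versions rotate projected service-account tokens in place, so re-reading the file shortly before each login picks up the current token without rebuilding or redeploying anything. A minimal sketch (helper name is my own):

```python
from pathlib import Path

# Default location of the projected service-account token inside a pod.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def read_sa_token(path: str = SA_TOKEN_PATH) -> str:
    # Re-read on every call: the kubelet refreshes the projected token
    # file in place, so a long-running pod always gets a valid token.
    return Path(path).read_text().rstrip()
```

Calling `read_sa_token()` immediately before constructing `TokenAuthentication` should address the "how does one update a token" question for long-running or scheduled workloads.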
We finally made it work, but I do not think this is sustainable for the wider user population. Here is what we had to do. First, the image build (Dockerfile):
```dockerfile
RUN apt update && apt install -y wget

# install oc
RUN mkdir /opt/oc
RUN wget -O /opt/oc/release.tar.gz https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/stable-4.11/openshift-client-linux-4.11.40.tar.gz
RUN tar -xzvf /opt/oc/release.tar.gz -C /opt/oc/ && \
    mv /opt/oc/oc /usr/bin/ && \
    rm -rf /opt/oc

# install libraries
RUN pip install --upgrade pip && pip install codeflare-sdk
RUN pip install "ray[default]"==2.1.0

# Allow writes for yaml files
RUN chmod -R 777 /tmp
```
Then the RBAC additions for the `pipeline-runner` service account:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kfp-openshift-route
rules:
# (rules elided in the original post)
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pipeline-runner-binding-mcad
  namespace: odh-applications
subjects:
- kind: ServiceAccount
  name: pipeline-runner
  namespace: odh-applications
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mcad-mcad-controller-role
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pipeline-runner-binding-ray
  namespace: odh-applications
subjects:
- kind: ServiceAccount
  name: pipeline-runner
  namespace: odh-applications
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mcad-controller-ray-clusterrole
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pipeline-runner-binding-route
  namespace: odh-applications
subjects:
# (subjects and roleRef elided in the original post)
```
And finally the pipeline itself:

```python
import kfp.components as comp
import kfp.dsl as dsl
from kfp_tekton.compiler import TektonCompiler
from kubernetes import client as k8s_client


# execute ray pipeline
def execure_ray_pipeline(token: str,       # token to authenticate to cluster
                         name: str,        # name of Ray cluster
                         min_worker: str,  # min number of workers
                         max_worker: str,  # max number of workers
                         min_cpus: str,    # min cpus per worker
                         max_cpus: str,    # max cpus per worker
                         min_memory: str,  # min memory per worker
                         max_memory: str,  # max memory per worker
                         image: str = "ghcr.io/foundation-model-stack/base:ray2.1.0-py38-gpu-pytorch1.12.0cu116-20221213-193103"
                         ):
    # Ray code - basically hello world
    ...


# components
ray_pipiline_op = comp.func_to_container_op(
    func=execure_ray_pipeline,
    base_image="blublinsky1/kfp-oc:0.0.2"
)


# Pipeline to invoke execution on remote resource
@dsl.pipeline(
    name='simple-ray-pipeline',
    description='Pipeline to show how to use codeflare sdk to create Ray cluster and run jobs'
)
def simple_ray_pipeline(token: str,             # token to authenticate to cluster
                        name: str = "kfp-ray",  # name of Ray cluster
                        min_worker: str = "2",  # min number of workers
                        max_worker: str = "2",  # max number of workers
                        min_cpus: str = "2",    # min cpus per worker
                        max_cpus: str = "2",    # max cpus per worker
                        min_memory: str = "4",  # min memory per worker
                        max_memory: str = "4",  # max memory per worker
                        image: str = "ghcr.io/foundation-model-stack/base:ray2.1.0-py38-gpu-pytorch1.12.0cu116-20221213-193103"
                        ):
    ...


if __name__ == '__main__':
    # Compiling the pipeline
    ...
```
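The compile call itself is elided above. As a hedged, self-contained sketch (not the author's code; the stub pipeline and output filename are my assumptions), compiling with kfp-tekton typically looks like:

```python
import os
import tempfile

# Sketch: compile a stand-in pipeline with kfp-tekton's TektonCompiler.
# Guarded so it degrades gracefully where kfp-tekton is not installed.
compiled_path = None
try:
    import kfp.dsl as dsl
    from kfp_tekton.compiler import TektonCompiler

    @dsl.pipeline(name="simple-ray-pipeline")
    def stub_pipeline(token: str = ""):
        # stand-in for the simple_ray_pipeline body shown above
        pass

    out = os.path.join(tempfile.mkdtemp(), "simple_ray_pipeline.yaml")
    TektonCompiler().compile(stub_pipeline, out)
    compiled_path = out
except Exception:
    # kfp-tekton unavailable (or compile failed) in this environment
    compiled_path = None
```

The resulting YAML package is what gets uploaded to the Kubeflow Pipelines UI or API.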