project-codeflare / codeflare-sdk

An intuitive, easy-to-use python interface for batch resource requesting, access, job submission, and observation. Simplifying the developer's life while enabling access to high-performance compute resources, either in the cloud or on-prem.
Apache License 2.0

Stack Installation Documentation (w/o ODH) #143

Open Maxusmusti opened 1 year ago

Maxusmusti commented 1 year ago

Add proper documentation for installing/using the CodeFlare stack without ODH (within the project-codeflare repo)

tedhtchang commented 1 year ago

I tried @astefanutti's PR to install CodeFlare on a local KinD cluster. The install works, but KinD does not support the Route CRD, so cluster.up() fails on KinD; it worked on CRC. make deploy has a lot of dependencies, so it needs to be translated into something that uses only oc:

export KUBERAY_VERSION=v0.5.0 && kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=${KUBERAY_VERSION}&timeout=90s"
oc create -k "https://github.com/project-codeflare/codeflare-operator/config/default?ref=v0.0.4"
cat <<EOF | oc apply -n codeflare-system -f -
apiVersion: codeflare.codeflare.dev/v1alpha1
kind: MCAD
metadata:
  name: mcad
spec:
  controllerResources: {}
  controllerImage: quay.io/project-codeflare/mcad-controller:main-v1.31.0
EOF

cat <<EOF | oc apply -n codeflare-system -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mcad-controller-rayclusters
rules:
  - apiGroups:
      - ray.io
    resources:
      - rayclusters
      - rayclusters/finalizers
      - rayclusters/status
    verbs:
      - get
      - list
      - watch
      - create
      - update
      - patch
      - delete
EOF

cat <<EOF | oc apply -n codeflare-system -f -
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mcad-controller-rayclusters
subjects:
  - kind: ServiceAccount
    name: mcad-controller-mcad
    namespace: codeflare-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mcad-controller-rayclusters
EOF
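
Because cluster.up() depends on the Route CRD noted at the top of this comment, it can help to check whether a cluster exposes that API before trying it. The following is a minimal sketch, not part of the original steps: has_route_api is a hypothetical helper name, and it assumes the resource appears as routes.route.openshift.io in the output of kubectl api-resources -o name, as it does on OpenShift.

```shell
# Hypothetical helper: reads the output of `kubectl api-resources -o name`
# on stdin and succeeds only if the OpenShift Route resource is listed.
has_route_api() {
  grep -qx 'routes.route.openshift.io'
}

# Usage against a live cluster (commented out so the sketch has no side effects):
#   if kubectl api-resources -o name | has_route_api; then
#     echo "Route API available"
#   else
#     echo "No Route API; expect cluster.up() to fail as described above"
#   fi
```

On a plain KinD cluster the check is expected to fail, which matches the behavior reported above.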
tedhtchang commented 1 year ago

Installation update:

Install KinD

https://kind.sigs.k8s.io/docs/user/quick-start/#installation

Create a cluster using KinD

cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
EOF

Deploy the NGINX Ingress Controller with SSL passthrough support

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml

Turn on SSL Passthrough

kubectl patch deploy --type json --patch '[{"op":"add","path": "/spec/template/spec/containers/0/args/-","value":"--enable-ssl-passthrough"}]' ingress-nginx-controller -n ingress-nginx
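
The JSON patch above appends to the container's args list, so running it twice adds the flag twice. A small guard can make the step safe to repeat. This is a sketch under assumptions: needs_passthrough_flag is a hypothetical helper, and it expects the current args (for example, as printed by kubectl get with a jsonpath expression) on stdin.

```shell
# Hypothetical helper: reads the controller's current args on stdin and
# succeeds only if --enable-ssl-passthrough is NOT yet present.
needs_passthrough_flag() {
  ! grep -q -- '--enable-ssl-passthrough'
}

# Usage against a live cluster (commented out so the sketch has no side effects):
#   kubectl get deploy ingress-nginx-controller -n ingress-nginx \
#     -o jsonpath='{.spec.template.spec.containers[0].args}' \
#     | needs_passthrough_flag \
#     && kubectl patch deploy --type json \
#          --patch '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--enable-ssl-passthrough"}]' \
#          ingress-nginx-controller -n ingress-nginx
```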

Verify that the log contains "Starting TLS proxy for SSL Passthrough"

kubectl logs deploy/ingress-nginx-controller -n ingress-nginx
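
The controller restarts after the patch, so the expected line may not be in the log right away. A polling loop is one way to wait for it instead of re-running the command by hand. This is a rough sketch: wait_for_log_line and the POLL_INTERVAL variable are hypothetical names introduced here, and the attempt count is arbitrary.

```shell
# Hypothetical helper: runs a command repeatedly until its output contains
# the given fixed string, or until the attempt budget is used up.
#   $1    = fixed string to look for
#   $2    = maximum number of attempts
#   $3... = command (and arguments) that prints the log
wait_for_log_line() {
  pattern=$1
  attempts=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" 2>/dev/null | grep -qF -- "$pattern"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-2}"   # seconds between attempts; defaults to 2
  done
  return 1
}

# Usage against a live cluster (commented out so the sketch has no side effects):
#   wait_for_log_line 'Starting TLS proxy for SSL Passthrough' 30 \
#     kubectl logs deploy/ingress-nginx-controller -n ingress-nginx
```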

MCAD

git clone https://github.com/project-codeflare/multi-cluster-app-dispatcher
cd multi-cluster-app-dispatcher
helm install mcad --set image.repository=quay.io/project-codeflare/mcad-controller --set image.tag=stable deployment/mcad-controller
kubectl apply -f doc/usage/examples/kuberay/config/xqueuejob-controller.yaml

KubeRay

export KUBERAY_VERSION=v0.6.0
kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=${KUBERAY_VERSION}&timeout=90s"

Example AppWrapper

kubectl apply -f https://github.com/project-codeflare/codeflare-sdk/files/12421615/local_interactive_aw.txt

Test

Generating the AppWrapper (AW) with codeflare-sdk depends on PR #251.

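cd codeflare-sdk
# Check out PR #251 as a local branch (assumes the remote "origin" points at
# project-codeflare/codeflare-sdk; the branch name pr-251 is arbitrary)
git fetch origin pull/251/head:pr-251
git checkout pr-251
jupyter lab demo-notebooks/interactive/local_interactive.ipynb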

Or, on a Linux VM as the root user: jupyter lab --ip 0.0.0.0 --no-browser --allow-root

Verify

[Screenshot not preserved]