mintproject / mint-ensemble-manager

Ensemble Manager for MINT

Enable Running a Job Using an RWO PVC #37

Closed: mosoriob closed this issue 1 month ago

mosoriob commented 1 month ago

Description

Add the ability to run a Job with a ReadWriteOnce (RWO) PVC by ensuring it is scheduled on the same node where the PVC is mounted. This feature will be configurable via the kubernetes.nodeaffinity option in the configuration file. If nodeaffinity is enabled, the Job will include a node affinity specification to ensure it runs on the correct node.

Requirements

New methods

  1. getNodeName: This function should retrieve the nodeName of the current Pod, using the Pod name taken from the POD_NAME environment variable (see the Downward API sketch after this list for how POD_NAME can be provided).

    Example of how to obtain the node name using the Kubernetes JavaScript client:

    const k8s = require('@kubernetes/client-node');

    const kc = new k8s.KubeConfig();
    kc.loadFromDefault();

    const k8sApi = kc.makeApiClient(k8s.CoreV1Api);

    // Read the Pod object and return the name of the node it is scheduled on.
    // The namespace defaults to 'default'; pass the namespace the Pod actually runs in.
    async function getNodeName(podName, namespace = 'default') {
      try {
        const res = await k8sApi.readNamespacedPod(podName, namespace);
        return res.body.spec.nodeName;
      } catch (err) {
        console.error('Error fetching pod info:', err);
        return null;
      }
    }

    // Usage: POD_NAME is expected to be set in the container environment.
    const podName = process.env.POD_NAME;
    getNodeName(podName).then(nodeName => {
      console.log('Node name:', nodeName);
    });

    This will return the node name of the Pod. For example:

    Node name: hcc-nrp-shor-c6029.unl.edu
  2. Add Node Affinity to Job Spec: When kubernetes.nodeaffinity is enabled in the config file, modify the Job's Pod spec to include a nodeAffinity field, ensuring that the Job is scheduled on the same node where the PVC is mounted.

    Example of adding nodeAffinity to the Job spec using the Kubernetes JavaScript client:

    const k8s = require('@kubernetes/client-node');

    // Add a required node affinity to an existing Job spec so that its Pod
    // is scheduled on the given node (the node where the RWO PVC is mounted).
    function addNodeAffinity(jobSpec, nodeName) {
      if (!jobSpec.spec.template.spec.affinity) {
        jobSpec.spec.template.spec.affinity = {};
      }

      jobSpec.spec.template.spec.affinity.nodeAffinity = {
        requiredDuringSchedulingIgnoredDuringExecution: {
          nodeSelectorTerms: [{
            matchExpressions: [{
              key: 'kubernetes.io/hostname',
              operator: 'In',
              values: [nodeName]
            }]
          }]
        }
      };

      return jobSpec;
    }

    // Usage
    const jobSpec = {
      // ... existing job specification (must already contain spec.template.spec)
    };
    const nodeName = 'hcc-nrp-shor-c6029.unl.edu';
    const updatedJobSpec = addNodeAffinity(jobSpec, nodeName);
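
For getNodeName to work, the Pod name has to be present in the container environment. A minimal sketch of how POD_NAME can be injected with the Kubernetes Downward API is shown below; this fragment would go in the manager's own Pod or Deployment spec, which is not part of this issue:

env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name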

Configuration
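
The exact configuration format is not shown in this issue; assuming a YAML-style configuration file, the new option described above might look roughly like:

kubernetes:
  nodeaffinity: true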

Steps

  1. Fetch the Pod name from the environment variable POD_NAME.
  2. Use the Pod name to look up the node the Pod is running on via the Kubernetes JavaScript client.
  3. If kubernetes.nodeaffinity is enabled, include the nodeAffinity in the Job spec so the Job is scheduled on the same node (see the sketch after this list).
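
Putting the pieces together, a minimal sketch of how the two helpers above could be wired into the Job submission path (prepareJobSpec and the shape of the config object are assumptions for illustration, not existing code in the repository):

const podName = process.env.POD_NAME;

// Hypothetical wrapper: only pin the Job to a node when the
// kubernetes.nodeaffinity option is enabled in the configuration.
async function prepareJobSpec(jobSpec, config, namespace) {
  if (!config.kubernetes || !config.kubernetes.nodeaffinity) {
    return jobSpec;
  }
  const nodeName = await getNodeName(podName, namespace);
  if (!nodeName) {
    // Could not determine the node; leave the Job spec unchanged.
    return jobSpec;
  }
  return addNodeAffinity(jobSpec, nodeName);
}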

Notes

mosoriob commented 1 month ago

POC test

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

---
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  nodeName: hcc-nrp-shor-c6029.unl.edu
  containers:
  - name: container1
    image: nginx
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
  volumes:
  - name: shared-volume
    persistentVolumeClaim:
      claimName: shared-pvc

---
apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  nodeName: hcc-nrp-shor-c6029.unl.edu
  containers:
  - name: container2
    image: nginx
    volumeMounts:
    - name: shared-volume
      mountPath: /shared-data
  volumes:
  - name: shared-volume
    persistentVolumeClaim:
      claimName: shared-pvc

This YAML specification creates:

  1. A PersistentVolumeClaim named shared-pvc with ReadWriteOnce access mode.
  2. Two pods (pod1 and pod2) that:
    • Are scheduled on the node hcc-nrp-shor-c6029.unl.edu
    • Use the same PVC (shared-pvc)
    • Mount the shared volume at /shared-data

To test this setup:

  1. Apply the specification using kubectl apply -f <filename>.yaml
  2. Check if both pods are running on the specified node:
    kubectl get pods -o wide
  3. Verify that both pods can read and write to the shared volume:
    kubectl exec pod1 -- touch /shared-data/test-file
    kubectl exec pod2 -- ls /shared-data

This configuration should work because both pods are scheduled on the same node, allowing them to access the same RWO PVC. Remember that if you try to schedule these pods on different nodes, the second pod will remain in a Pending state due to the RWO access mode of the PVC.
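
For reference, a sketch of what a generated Job manifest could look like once the feature adds the affinity block; the Job name, image, and command below are placeholders, not part of this issue:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job          # placeholder name
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - hcc-nrp-shor-c6029.unl.edu
      containers:
      - name: worker          # placeholder container
        image: busybox
        command: ["sh", "-c", "ls /shared-data"]
        volumeMounts:
        - name: shared-volume
          mountPath: /shared-data
      volumes:
      - name: shared-volume
        persistentVolumeClaim:
          claimName: shared-pvc
      restartPolicy: Never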