knix-microfunctions / knix

Serverless computing platform with process-based lightweight function execution and container-based application isolation. Works in Knative and bare metal/VM environments.
https://knix.io
Apache License 2.0

Enable dynamic GPU scheduling #79

Open ksatzke opened 4 years ago

ksatzke commented 4 years ago

Currently, when deploying via the Helm charts, the resource limits for KNIX components are fixed at deployment time, like so:

resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 1
    memory: 1Gi

Each workflow deployment's GPU allowance should likewise be configurable at deployment time. This would let users declare dynamically that a workflow should run on GPUs instead of CPUs, and let KNIX schedule the workflow on a node that still has sufficient GPU cores available, like so:

resources:
  limits:
    cpu: 1
    memory: 2Gi
    nvidia.com/gpu: 1 # requesting 1 GPU
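As a sketch of how this could be wired into the Helm charts, the resource block of the sandbox deployment template could render the GPU limit only when a per-workflow value is set. The value name `.Values.workflow.gpu` is an assumption for illustration, not part of the current chart schema:

```yaml
# Hypothetical deployment template fragment (sketch).
# .Values.workflow.gpu is an assumed per-workflow value, not an
# existing KNIX chart key; nvidia.com/gpu is the standard resource
# name exposed by the NVIDIA device plugin.
resources:
  limits:
    cpu: 1
    memory: 2Gi
    {{- if .Values.workflow.gpu }}
    nvidia.com/gpu: {{ .Values.workflow.gpu }}
    {{- end }}
```

With such a value in place, the Kubernetes scheduler would only place the workflow's sandbox on nodes advertising at least that many allocatable GPUs.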

iakkus commented 4 years ago

These need to be done in the feature/GPU_support_extended branch, right?

ksatzke commented 4 years ago

Right. If we can agree on the issue, we can implement it in that branch to extend KNIX GPU support.