Kruize reads CPU and memory usage data from the provided data source and generates right-sizing recommendations for CPU and memory. In a similar way, it would be good to have GPU MIG partition sizing recommendations for containers that use GPUs.
Examples or references
Most ML workloads need GPU power, and advanced NVIDIA GPUs support MIG (Multi-Instance GPU), where a single physical GPU can be partitioned into multiple logical GPU instances that can be configured and shared across multiple containers. Ampere (from A30) and Hopper series GPUs provide this feature.
Suggest a solution
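Record the GPU-related metrics (for example, GPU utilization and GPU memory usage per container)
Process these metrics along with the CPU and memory metrics to produce a MIG partition sizing recommendation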
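
As a rough illustration of the two steps above, here is a minimal sketch of how recorded GPU samples might be mapped to a MIG profile. It is not Kruize code. The function names, the 95th percentile, and the 10% headroom are all hypothetical choices. The profile table lists the standard MIG profiles of an A100 40GB; other GPUs expose different profiles. The sketch also assumes that utilization measured on a full GPU scales linearly with the number of compute slices, which is a simplification.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MigProfile:
    name: str
    compute_slices: int  # compute slices out of 7 on an A100
    memory_gb: int       # framebuffer memory of the instance


# MIG profiles of an A100 40GB, ordered from smallest to largest.
A100_40GB_PROFILES = [
    MigProfile("1g.5gb", 1, 5),
    MigProfile("2g.10gb", 2, 10),
    MigProfile("3g.20gb", 3, 20),
    MigProfile("4g.20gb", 4, 20),
    MigProfile("7g.40gb", 7, 40),
]
TOTAL_COMPUTE_SLICES = 7


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_mig_profile(gpu_util_pct, gpu_mem_gb, pct=95.0, headroom=1.1):
    """Return the smallest profile covering the observed usage, or None.

    gpu_util_pct: utilization samples, as a percentage of a full GPU.
    gpu_mem_gb:   GPU memory usage samples, in GB.
    pct, headroom: hypothetical tuning knobs, not Kruize settings.
    """
    util = percentile(gpu_util_pct, pct) * headroom
    mem = percentile(gpu_mem_gb, pct) * headroom
    # Assumption: required compute scales linearly with utilization.
    needed_slices = util / 100 * TOTAL_COMPUTE_SLICES
    for profile in A100_40GB_PROFILES:
        if profile.compute_slices >= needed_slices and profile.memory_gb >= mem:
            return profile
    return None  # usage plus headroom exceeds the largest profile


if __name__ == "__main__":
    util_samples = [10, 12, 15, 20, 18]       # percent of a full GPU
    mem_samples = [3.0, 3.5, 4.0, 4.2, 4.1]   # GB
    print(recommend_mig_profile(util_samples, mem_samples))
```

A real implementation would need per-container GPU metrics from the data source and a profile table for each supported GPU model. The percentile and headroom would likely follow whatever Kruize already uses for its CPU and memory recommendations.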
Additional Context
None