ai-cfia / howard

The Howard project, named after "The Godfather of Clouds" Luke Howard, orchestrates the Kubernetes-based cloud infrastructure for the Canadian Food Inspection Agency's AI lab, managing applications like Nachet, Finesse, and Louis. It prioritizes robustness, security, and efficiency.
https://ai-cfia.github.io/howard/
MIT License

As a DevSecOps engineer, I would like to deploy a secondary Kubernetes cluster in Azure with enhanced node pool features #203

Closed by ThomasCardin 2 months ago

ThomasCardin commented 5 months ago

Executive summary

The current Kubernetes cluster setup lacks the flexibility to switch node pool types, which limits our capacity for resource-intensive computing. This issue proposes creating a secondary Kubernetes cluster in Azure with a node pool that includes GPU capabilities, to optimize computational resources for AI-based projects.

Context

Our existing Kubernetes infrastructure does not support changes to the node pool configuration after initial setup, which restricts our ability to adapt to evolving project needs. The primary requirement for the new cluster is to support advanced computational tasks involving heavy AI and machine learning workloads. These tasks require significantly more computational power, including GPUs. By leveraging Istio, which is natively supported in Azure Kubernetes Service (AKS), we aim to implement a multi-cluster mesh that improves connectivity and simplifies management across our clusters.
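To make the GPU requirement concrete, here is a minimal sketch of how a workload could ask for a GPU once such a node pool exists. It is an illustration only, not the configuration used in this repository. The helper name `gpu_pod_manifest`, the pod name, the image, and the `sku=gpu:NoSchedule` taint are all assumptions. The `nvidia.com/gpu` resource name is the one exposed by the NVIDIA device plugin, which must be running on the GPU nodes.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests `gpus` NVIDIA GPUs (hypothetical helper)."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested through limits; the scheduler then
                    # only places the pod on a node advertising this resource.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
            # Assumption: the GPU node pool is tainted sku=gpu:NoSchedule so
            # that ordinary workloads stay off the expensive nodes.
            "tolerations": [
                {
                    "key": "sku",
                    "operator": "Equal",
                    "value": "gpu",
                    "effect": "NoSchedule",
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image name; replace with a real CUDA-enabled image.
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.invalid/cuda-job:latest")
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` against the secondary cluster to check that GPU scheduling works.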

TODO

References

Istio multicluster mesh
Azure Istio service mesh
AKS GPU workloads

ThomasCardin commented 2 months ago

Done