SymbioticLab / FedScale

FedScale is a scalable and extensible open-source federated learning (FL) platform.
https://fedscale.ai
Apache License 2.0

Support GPU jobs in FedScale K8S deployment #187

Closed IKACE closed 2 years ago

IKACE commented 2 years ago

Why are these changes needed?

This PR adds GPU job support to the FedScale K8s deployment, along with some quality-of-life enhancements.

  1. Support GPU training for FedScale K8s jobs.
  2. Support GPU time-sharing, so that multiple FedScale K8s jobs can share the same GPU simultaneously. This requires no changes in the FedScale repo; the changes are made solely in the K8s infrastructure.
  3. Support checking FedScale K8s job progress interactively.
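
For readers unfamiliar with how a K8s job gets a GPU, here is a minimal sketch of items 1 and 3. It is not the code in this PR. It assumes the cluster exposes GPUs through the NVIDIA device plugin's `nvidia.com/gpu` resource, and the pod name, image, and command are hypothetical placeholders. Progress is checked with the standard `kubectl logs -f`, which may differ from the mechanism this PR adds.

```python
import json


def gpu_pod_manifest(name, image, command, num_gpus=1):
    """Build a Pod manifest (as a dict) whose container requests GPUs.

    Assumes the NVIDIA device plugin is installed, so that GPUs are
    schedulable under the extended resource name "nvidia.com/gpu".
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": command,
                    # GPUs are requested in the limits section.
                    "resources": {"limits": {"nvidia.com/gpu": num_gpus}},
                }
            ],
        },
    }


def follow_logs_command(pod_name):
    """Return the kubectl argv that streams a pod's logs as they are written."""
    return ["kubectl", "logs", "-f", pod_name]


if __name__ == "__main__":
    # Hypothetical names; substitute the real image and entry point.
    manifest = gpu_pod_manifest(
        name="fedscale-executor-demo",
        image="example/fedscale:latest",
        command=["python", "executor.py"],
    )
    # kubectl accepts JSON manifests, so this output can be piped to
    # `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
    print(" ".join(follow_logs_command("fedscale-executor-demo")))
```

With time-sharing configured on the cluster side, several such pods can land on the same physical GPU, which fits the note in item 2 that no FedScale changes are needed.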

fanlai0990 commented 2 years ago

Thanks! Looks good to me.