skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k stars 513 forks

[k8s] fix managed job issue on k8s #4357

Closed nkwangleiGIT closed 5 days ago

nkwangleiGIT commented 1 week ago
  1. Initialize a new dict if the destination is None, then add the source key/value to it.
  2. Also set the CPU/memory values as the k8s resource limits. Otherwise the pod fails to create when a LimitRange is configured in the namespace, because the default CPU/memory limits may be smaller than the requests.
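The two changes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `merge_into` and `set_requests_and_limits` and the container dict layout are assumptions made for the example.

```python
from typing import Any, Dict, Optional


def merge_into(destination: Optional[Dict[str, Any]],
               source: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `source` into `destination` (hypothetical helper).

    If the destination is None, start from a fresh dict instead of failing.
    """
    if destination is None:
        destination = {}
    for key, value in source.items():
        if isinstance(value, dict):
            existing = destination.get(key)
            # Treat a missing, None, or non-dict value as an empty destination.
            if not isinstance(existing, dict):
                existing = None
            destination[key] = merge_into(existing, value)
        else:
            destination[key] = value
    return destination


def set_requests_and_limits(container: Dict[str, Any], cpu: str,
                            memory: str) -> Dict[str, Any]:
    """Set limits equal to requests (hypothetical helper).

    With explicit limits, a LimitRange default that is smaller than the
    requests is not applied to the container.
    """
    resources = {
        'requests': {'cpu': cpu, 'memory': memory},
        'limits': {'cpu': cpu, 'memory': memory},
    }
    container['resources'] = merge_into(container.get('resources'), resources)
    return container


# Example: a container spec whose `resources` field is explicitly None.
container = set_requests_and_limits({'name': 'example', 'resources': None},
                                    cpu='4', memory='16Gi')
```

Setting limits equal to requests is one way to keep the pod valid under a LimitRange. The values must still fall within any min/max bounds the LimitRange defines.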

Tested (run the relevant ones):