Open Michaelvll opened 1 year ago
+1 - ran into this today when I was trying to limit the number of tasks running on a cluster for my CPU-only job.
This also gets very confusing because we silently allow scheduling of infeasible tasks (according to resource spec). E.g., I was able to do something like:
$ sky launch -c test -- echo hi # Launches an 8-CPU cluster
$ sky launch -c test --cpus 24 -- echo hi2 # I was expecting this to fail, but it went through
Till we fix this, we should perhaps throw a warning to the user that CPUs are not respected during scheduling.
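To make the suggested warning concrete, here is a minimal sketch of the kind of feasibility check that could run before a task is scheduled. It is not SkyPilot's code: the function names `infeasible_requests` and `warn_if_infeasible` and the plain-dict resource format are hypothetical, chosen only for illustration.

```python
import warnings


def infeasible_requests(requested, per_node):
    """Return {resource: (requested, available)} for requests no single node can satisfy.

    Both arguments are hypothetical plain dicts such as {"cpus": 24};
    a resource missing from per_node is treated as 0 available.
    """
    return {
        name: (amount, per_node.get(name, 0))
        for name, amount in requested.items()
        if amount > per_node.get(name, 0)
    }


def warn_if_infeasible(requested, per_node):
    """Warn once per oversized request and return the offending entries."""
    problems = infeasible_requests(requested, per_node)
    for name, (want, have) in problems.items():
        warnings.warn(
            f"Task requests {want} {name}, but each node has only {have}; "
            "this request is not enforced during scheduling."
        )
    return problems


# The case from this comment: an 8-CPU cluster and a task asking for 24 CPUs.
problems = warn_if_infeasible({"cpus": 24}, {"cpus": 8})
```

Whether such a check should only warn or should reject the task outright is a design choice; the sketch warns, matching the interim suggestion above.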
The same applies to the memory parameter. I specified a requirement of 16 GB of memory for an exec task:
sky exec -d my-cluster --memory=16 tasks/test.yaml
However, the task ran on nodes that have only 8 GB of memory.
@stolendog Was this on a k8s cluster, or on a cloud VM cluster?
On a cloud VM cluster.
Currently, --cpus or cpus: is not respected when scheduling jobs for a SkyPilot task, which forces users with CPU-only tasks to schedule them manually.