For load-balancing purposes, it is often desirable to schedule a task onto a node with lower disk utilization. A user might also require a certain amount of free disk space to run a task; ideally, if the task fails on one node, it would be retried automatically on another node that does have enough disk space.
We can make two possible enhancements:
[ ] Consider disk utilization in the scheduling policy. Alternatively, we could consider only the disk space used by Ray's spilled objects.
[ ] Support arbitrary user-defined scheduling constraints, like "only schedule this task on a node with X disk space"
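To make the two ideas concrete, here is a minimal sketch of what such a policy could look like. It is not Ray's scheduler and uses no Ray APIs: `NodeDisk`, `pick_node`, `required_free_bytes`, and `exclude` are hypothetical names invented for illustration, and the per-node disk report is assumed to be available from somewhere.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NodeDisk:
    """Hypothetical per-node disk report; not a Ray data structure."""
    node_id: str
    total_bytes: int  # assumed to be positive
    used_bytes: int

    @property
    def free_bytes(self) -> int:
        return self.total_bytes - self.used_bytes

    @property
    def utilization(self) -> float:
        return self.used_bytes / self.total_bytes


def pick_node(
    nodes: Sequence[NodeDisk],
    required_free_bytes: int = 0,
    exclude: frozenset = frozenset(),
) -> Optional[NodeDisk]:
    """Return the least-utilized node that satisfies the disk constraint.

    Nodes listed in `exclude` (for example, nodes where the task already
    failed) are skipped. Returns None when no node qualifies.
    """
    candidates = [
        n for n in nodes
        if n.node_id not in exclude and n.free_bytes >= required_free_bytes
    ]
    # Prefer the node with the lowest disk utilization (load balancing).
    return min(candidates, key=lambda n: n.utilization, default=None)
```

A retry on another node would then amount to calling `pick_node` again with the failed node's ID added to `exclude`.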
I'm very interested in this. Spilling is becoming a problem for us because it fills the disks.
A wrapper that reports whether spilling is happening would be great.
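In the meantime, a rough workaround might be to watch disk pressure directly. The sketch below uses only the standard library's `shutil.disk_usage`; it does not detect Ray spilling itself, only how full the filesystem containing a given path is. The function name `disk_pressure`, the `threshold` parameter, and the choice of path are assumptions for illustration (you would point it at whatever directory your spilled objects land in).

```python
import shutil


def disk_pressure(path: str = ".", threshold: float = 0.9) -> dict:
    """Report how full the filesystem containing `path` is.

    Hypothetical helper: it knows nothing about Ray and only measures
    overall disk usage, which spilling would contribute to.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return {
        "path": path,
        "used_fraction": used_fraction,
        "free_bytes": usage.free,
        "over_threshold": used_fraction >= threshold,
    }
```

A task could call this before starting heavy work and bail out early (so it can be retried elsewhere) when `over_threshold` is true.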