Open mkurczew opened 1 year ago
This is probably the right place @mkurczew :)
To limit the resources available to an agent, you can use extra_docker_arguments,
for example:
extra_docker_arguments=["--memory=1g", "--cpus=2"]
(see https://docs.docker.com/config/containers/resource_constraints/)
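If you want to derive those values from the machine size rather than hard-code them, the following is a minimal sketch, assuming an even static split between agents. The helper name docker_limit_args and its parameters are hypothetical and not part of ClearML; only the --memory and --cpus flags come from the Docker documentation linked above. Note that these are hard caps, so this does not give the "at least X, more if available" behaviour asked about below.

```python
def docker_limit_args(total_cpus: int, total_mem_mib: int, n_agents: int) -> list[str]:
    """Split a workstation's CPUs and RAM evenly between n_agents.

    Hypothetical helper (not a ClearML API): it only builds the list of
    Docker flags that could be pasted into extra_docker_arguments.
    """
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    # --cpus accepts fractional values, so an uneven split is fine.
    cpus_per_agent = total_cpus / n_agents
    # --memory accepts an integer with a unit suffix; use MiB and round down.
    mem_per_agent_mib = total_mem_mib // n_agents
    return [f"--memory={mem_per_agent_mib}m", f"--cpus={cpus_per_agent:.2f}"]


if __name__ == "__main__":
    # Example: a 32-core machine with 96 GiB of RAM shared by 3 agents.
    print(docker_limit_args(total_cpus=32, total_mem_mib=96 * 1024, n_agents=3))
```

The resulting list can then be used as the value of extra_docker_arguments for each agent on that workstation.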
Hi, I have several workstations with many CPU cores and a lot of RAM. All agents run Ubuntu. I would like to run multiple ClearML agents (barebones, no k8s) on each of the workstations.
Can I somehow prevent one agent running a job from hoarding all of the resources (e.g. CPU cores) and guarantee each agent a minimum quota (preferably dynamic, e.g. at least 8 cores and 1/3 of the RAM, but more if available)?
Or, alternatively, can I prevent another job from being accidentally scheduled on a machine that is already bogged down by other tasks?
Is there any way to achieve that with ClearML? The documentation suggests that the only resource agents "manage" is GPUs.
EDIT: Just started to wonder, should I file this here or under the ClearML project?