Closed Simon-Harris-IBM closed 4 years ago
The current plan is to run only a single GPU-enabled container at a time on the backend nodes. That container will be allowed to use all the GPUs, but its CPU and RAM should be limited so that the infrastructure components (the orchestrator et al.), whose needs are relatively minimal, are not starved of resources.
Tentatively:
- allowed RAM = total RAM - 8GB
- total disk = if we leave this unlimited, it will be bound by the filesystem that holds docker's /var/lib/docker, which may be fine. Need to investigate how to clean up / garbage collect after an image is removed.
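A minimal sketch of the RAM rule above, plus a way to check how much space is left on the filesystem behind /var/lib/docker. The function names are hypothetical, "8GB" is assumed to mean 8 GiB, and the `os.sysconf` lookup assumes a Linux host; none of this is taken from the actual orchestrator code.

```python
import os
import shutil

GIB = 1024 ** 3
# Assumption: the "8GB" headroom for infrastructure components means 8 GiB.
RESERVED_BYTES = 8 * GIB


def allowed_ram_bytes(total_ram_bytes, reserved_bytes=RESERVED_BYTES):
    """Return the RAM a submission container may use: total RAM minus the reserve."""
    allowed = total_ram_bytes - reserved_bytes
    if allowed <= 0:
        raise ValueError("host has no RAM left after reserving headroom")
    return allowed


def host_total_ram_bytes():
    """Total physical RAM of the host (Linux-only lookup via sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def free_disk_bytes(path="/var/lib/docker"):
    """Free bytes on the filesystem containing `path` (docker's data root by default)."""
    return shutil.disk_usage(path).free
```

For example, `allowed_ram_bytes(host_total_ram_bytes())` would give the value to pass on as the container memory limit on a given node.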
The following settings were tested by Adam. The run time in the table is the model's run time directly on the machine (i.e. without Synapse):
VM | Memory (G) | CPUs | Shared Memory (G) | Run time |
---|---|---|---|---|
Todd | 32 | 7 | 16 | 6m54 |
Todd | 32 | 7 | 12 | 6m50 |
Todd | 32 | 7 | 8 | 6m36 |
Todd | 48 | 7 | 12 | 6m41 |
Todd | 32 | 6 | 12 | 8m15 |
Simon | 32 | 7 | 12 | 7m1 |
Simon | 32 | 7 | 12 | 6m50 |
Based on this, we've decided to go with the following settings:
Attempts to limit CPU consumption to 87% using the arguments documented in the docker-py API docs (https://docker-py.readthedocs.io/en/stable/containers.html) have not worked. I'm not too concerned about this, as CPU usage spikes to 100% only infrequently during a run.
With the above settings, inference using the example PyTorch model takes approximately 10m30s from submission to completion through Synapse.
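For reference, here is one way the kinds of limits discussed in this issue (memory, shared memory and a CPU cap of roughly 87%) might be expressed as docker-py arguments. This is a hedged sketch, not the code in run_docker.py: `build_limits` is a hypothetical helper, the 8-core host is an assumption (7 of 8 CPUs is 87.5%), and the example values are simply one row from the table of tested settings. `cpu_period`, `cpu_quota`, `mem_limit` and `shm_size` are keyword arguments of `containers.run` in docker-py, but it is not known which arguments were tried in the failed attempts, and this combination has not been verified on these nodes.

```python
def build_limits(total_cpus, cpu_fraction, mem_gb, shm_gb):
    """Build resource-limit kwargs for docker-py's `client.containers.run`."""
    # CFS period in microseconds; 100000 is Docker's default period.
    period = 100_000
    # The quota is the CPU time allowed per period, summed across all cores,
    # so 7 full cores out of 8 means 7 * period.
    quota = int(period * total_cpus * cpu_fraction)
    return {
        "cpu_period": period,
        "cpu_quota": quota,
        "mem_limit": f"{mem_gb}g",
        "shm_size": f"{shm_gb}g",
    }


# Example values only: 32G memory and 12G shared memory on an assumed 8-core host.
limits = build_limits(total_cpus=8, cpu_fraction=0.875, mem_gb=32, shm_gb=12)

# Intended use (requires the docker package and a running daemon):
#   import docker
#   client = docker.from_env()
#   client.containers.run(image, detach=True, **limits)
```

Keeping the limits in a plain dict makes them easy to log alongside each submission and to adjust per node without touching the call site.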
Need to tune the CPU, memory, disk and GPUs allocated to run the submitted docker container.
Code in run_docker.py: