Closed fcasson closed 2 years ago
I think adding an optional memoryPerCpu as an alternative to memory will definitely be needed.
The hard part is working out how to extend the existing schema in a sensible way. I'm still working on possibilities for this...
In my first version of this I've added memoryPerCpu and cpusRange to resources. With this, either cpus or cpusRange must be specified. With cpusRange = [16, 32] the job will run using anywhere between 16 and 32 CPU cores, with the largest possible value preferred.
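A minimal sketch of the selection rule described above for cpusRange, not the actual server code. The function name pick_cpus_from_range and the available argument (CPU cores free on a candidate node) are hypothetical and only illustrate "largest possible value preferred".

```python
def pick_cpus_from_range(cpus_range, available):
    """Return the largest CPU count inside cpus_range that fits, or None.

    cpus_range is [minimum, maximum], as in cpusRange = [16, 32].
    available is an assumed count of free cores on a candidate node.
    """
    low, high = cpus_range
    if low > high:
        raise ValueError("cpusRange must be [min, max] with min <= max")
    # Prefer the top of the range, capped by what is actually free.
    chosen = min(high, available)
    return chosen if chosen >= low else None


print(pick_cpus_from_range([16, 32], 64))  # 32
print(pick_cpus_from_range([16, 32], 20))  # 20
print(pick_cpus_from_range([16, 32], 8))   # None
```

With this rule a job never runs below the lower bound: if fewer than 16 cores are free, nothing is chosen and the job would have to wait or go elsewhere.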
A related change is that the JSON description for a running or completed job now contains provisionedResources in the execution section, so users can find out externally what resources the job has, e.g.
"provisionedResources": {
  "cpus": 2,
  "memory": 4,
  "nodes": 1
},
Within a job, as usual, the environment variables PROMINENCE_CPUS and PROMINENCE_MEMORY will correctly report the available resources.
None of this is available in the production API yet, but it has been tested successfully. The next step is to deal with multi-node jobs.
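As an illustration of reading the provisioned resources from inside a job, here is a small sketch using the PROMINENCE_CPUS and PROMINENCE_MEMORY variables mentioned above. The helper name and the assumption that both values are plain integers (as in the provisionedResources example) are mine, not confirmed by the thread.

```python
import os


def provisioned_from_env(environ=None):
    """Return (cpus, memory) as reported to the job, or None if unset.

    Assumes both variables hold integer strings; the memory unit is
    whatever the platform uses and is not converted here.
    """
    environ = os.environ if environ is None else environ
    try:
        cpus = int(environ["PROMINENCE_CPUS"])
        memory = int(environ["PROMINENCE_MEMORY"])
    except KeyError:
        # Not running inside a job, or the variables are not exported.
        return None
    return cpus, memory


# Example with an explicit mapping instead of the real environment.
print(provisioned_from_env({"PROMINENCE_CPUS": "2", "PROMINENCE_MEMORY": "4"}))
```

This is useful when cpusRange or cpusOptions is set, since the job cannot know in advance how many cores it will get and may want to size its thread pool accordingly.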
Also added a third option for CPUs: cpusOptions, for example cpusOptions = [14, 28]. With this, either 14 or 28 CPUs will be used, with 28 preferred. One user was basically doing this manually if a job requesting 28 CPUs was idle for too long.
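The cpusOptions behaviour can be sketched in the same spirit. This is an illustration only: the thread states that 28 is preferred for [14, 28], and the sketch assumes this generalizes to "largest listed value that fits". The function name and the available argument are hypothetical.

```python
def pick_cpus_from_options(cpus_options, available):
    """Return the largest listed CPU count that fits, or None.

    Unlike a range, only the listed values are allowed, so 20 free
    cores with cpusOptions = [14, 28] yields 14, not 20.
    """
    for cpus in sorted(cpus_options, reverse=True):
        if cpus <= available:
            return cpus
    return None


print(pick_cpus_from_options([14, 28], 32))  # 28
print(pick_cpus_from_options([14, 28], 20))  # 14
print(pick_cpus_from_options([14, 28], 8))   # None
```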
Added an optional totalCpusRange to resources. This is an alternative to nodes, allowing users to specify a total number of CPUs but not the number of nodes. Initially it will try to use the maximum number of CPUs with the minimum number of nodes, which I think would be the preferred option for most use cases.
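One way to picture "maximum number of CPUs with minimum number of nodes" is the sketch below. It assumes identical nodes, each with cores_per_node cores, and nodes_available of them free; both parameters and the function name are hypothetical simplifications, and the real scheduler may work quite differently.

```python
def plan_total_cpus(total_cpus_range, cores_per_node, nodes_available):
    """Return (nodes, total_cpus) for a totalCpusRange request, or None.

    Picks the largest total within the range that the free nodes can
    supply, then the fewest nodes that can hold that total.
    """
    low, high = total_cpus_range
    total = min(high, cores_per_node * nodes_available)
    if total < low:
        return None
    # Ceiling division: fewest nodes able to hold `total` cores.
    nodes = (total + cores_per_node - 1) // cores_per_node
    return nodes, total


print(plan_total_cpus([32, 64], cores_per_node=32, nodes_available=4))  # (2, 64)
print(plan_total_cpus([32, 64], cores_per_node=16, nodes_available=3))  # (3, 48)
print(plan_total_cpus([32, 64], cores_per_node=8, nodes_available=2))   # None
```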
To summarize, the resources part of the JSON job description contains:
- memory or memoryPerCpu: specify fixed memory per node (memory) or memory per CPU (memoryPerCpu)
- cpus, cpusRange or cpusOptions: specify a fixed number of CPUs per node (cpus), a possible range of numbers of CPUs (cpusRange) or a list of possibilities (cpusOptions)
- nodes or totalCpusRange: specify either a fixed number of nodes (nodes) or a range for the total number of CPUs across all nodes (totalCpusRange)

The options as mentioned above are available. Late closure.
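For anyone building the resources section by hand, the summary of alternatives can be turned into a quick client-side check. This sketch only flags fields from the same group being set together. It assumes the alternatives are mutually exclusive, which the wording "or" suggests but the thread does not state outright, and it is not the server's own validation.

```python
# Groups of alternative fields, taken from the summary in this thread.
ALTERNATIVES = [
    ("memory", "memoryPerCpu"),
    ("cpus", "cpusRange", "cpusOptions"),
    ("nodes", "totalCpusRange"),
]


def conflicting_fields(resources):
    """Return a list of field groups where more than one alternative is set."""
    conflicts = []
    for group in ALTERNATIVES:
        present = [name for name in group if name in resources]
        if len(present) > 1:
            conflicts.append(present)
    return conflicts


ok = {"cpusRange": [16, 32], "memoryPerCpu": 2}
bad = {"cpus": 4, "cpusOptions": [14, 28], "memory": 8}
print(conflicting_fields(ok))   # []
print(conflicting_fields(bad))  # [['cpus', 'cpusOptions']]
```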
Okay thanks. I assume this is currently available only from the API not from CLI (that's fine for us) - some examples / docs on the JSON layout would help.
It's available from the latest version of the CLI, which is documented. But good point - I'll try to improve the API documentation and make sure there are examples, sometime today...
Some examples of different resources possibilities are now in the second half of: https://prominence-eosc.github.io/docs/job-description-files#
Thanks!
In the context of multi-node MPI jobs and dynamic and transparent hardware provisioning, users should be able to provide ranges for
This should allow resources to be allocated more flexibly when users are unaware of the properties of available resources.
The first preference for the number of processes would be the maximum of the range. The preference for processes per node is less important, but if one is needed, this could also be the maximum.
The number of nodes required can then be inferred by the PROMINENCE server based on available resources. The memory per node requirement would also need to be inferred, or instead provided in terms of memory per process.
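To make the inference concrete, here is a rough sketch of how a node count and memory per node could be derived from a process range and a memory-per-process figure. All names and parameters are hypothetical, nodes are assumed identical, and one process per CPU core is assumed. It illustrates the request, not how the server implements it.

```python
def infer_layout(procs_range, mem_per_proc, node_cpus, node_memory, nodes_available):
    """Infer processes, nodes and memory per node, or return None.

    procs_range is [min, max] total processes, with the maximum preferred.
    mem_per_proc, node_memory share one unit (assumed, e.g. GB).
    """
    # Processes per node are limited by both cores and memory.
    per_node = min(node_cpus, int(node_memory // mem_per_proc))
    if per_node < 1:
        return None
    low, high = procs_range
    total = min(high, per_node * nodes_available)
    if total < low:
        return None
    nodes = (total + per_node - 1) // per_node  # ceiling division
    return {
        "processes": total,
        "nodes": nodes,
        # Memory needed on a fully packed node.
        "memory_per_node": per_node * mem_per_proc,
    }


print(infer_layout([32, 64], mem_per_proc=2, node_cpus=16, node_memory=64, nodes_available=4))
```

Expressing memory per process keeps the request valid whatever node shape ends up being provisioned, which is the point made above about users not knowing the properties of the available resources.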