fcasson opened this issue 2 years ago
Strawperson CLI syntax: prominence resources
Even better, for multi-node MPI jobs, allow setting procs-per-node and the number of nodes automatically, based on whichever resources are most available.
Could you please clarify this? Do you want to be able to specify a total number of procs, and then it will automatically run across whatever number of nodes is required in order to give this number?
Yes, exactly. That would be the most common use case (and is in line with the philosophy that physical hardware is not something the users need to think about)
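For illustration, a minimal sketch of what such automatic placement could look like; the greedy strategy, the place_job helper, and the node_types input format are assumptions for this sketch, not an existing PROMINENCE API:

import math

def place_job(total_procs, node_types):
    # node_types: list of dicts like {"cpus": 60, "nodes": 4} giving cores
    # per node and the number of such nodes available (hypothetical input).
    # Prefer the largest nodes so the job spans as few nodes as possible.
    for node in sorted(node_types, key=lambda n: n["cpus"], reverse=True):
        nodes_needed = math.ceil(total_procs / node["cpus"])
        if nodes_needed <= node["nodes"]:
            return {"nodes": nodes_needed, "procs_per_node": node["cpus"]}
    return None  # no single node type can host the request

# e.g. 120 procs -> 2 nodes of 60 cpus each
print(place_job(120, [{"cpus": 60, "nodes": 4}, {"cpus": 16, "nodes": 8}]))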
I think it is useful both to be able to probe resources and to be able to ignore them! I opened #156 for the latter, since these should be separate issues.
I suppose having more information than just quantities of resources would also be useful, e.g. the types of CPUs, if you want to run optimised code (AMD/Intel, what type of processor, ...). Along with this, the ability to restrict jobs to specific processors would help (e.g. if you have code optimised for Intel with AVX512).
In the first iteration I just need to know CPUs per node, memory per node, site, and the number of nodes of each. This could be via the CLI or on the Grafana pages.
This is just the same information we are already getting from you by email; it will allow us to target jobs while we wait for #156, which I imagine will take a bit longer.
Will something in this form be ok to begin with?
[
  {
    "cpus": 8,
    "memory": 32,
    "site": "OpenStack-STFC",
    "nodes": 1
  },
  {
    "cpus": 60,
    "memory": 244,
    "site": "OpenStack-STFC",
    "nodes": 4
  },
  {
    "cpus": 16,
    "memory": 32,
    "site": "OpenStack-TUBITAK",
    "nodes": 8
  },
  {
    "cpus": 16,
    "memory": 93,
    "site": "OpenStack-UNIV-LILLE",
    "nodes": 2
  }
]
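For illustration, a minimal sketch of how such a list could be used to target jobs as described above; sites_for_job and the job-shape arguments are hypothetical, not part of the client:

resources = [
    {"cpus": 8, "memory": 32, "site": "OpenStack-STFC", "nodes": 1},
    {"cpus": 60, "memory": 244, "site": "OpenStack-STFC", "nodes": 4},
    {"cpus": 16, "memory": 32, "site": "OpenStack-TUBITAK", "nodes": 8},
    {"cpus": 16, "memory": 93, "site": "OpenStack-UNIV-LILLE", "nodes": 2},
]

def sites_for_job(resources, nodes, cpus_per_node, memory_per_node):
    # Sites offering at least `nodes` nodes of the requested per-node size.
    return sorted({r["site"] for r in resources
                   if r["cpus"] >= cpus_per_node
                   and r["memory"] >= memory_per_node
                   and r["nodes"] >= nodes})

print(sites_for_job(resources, nodes=4, cpus_per_node=16, memory_per_node=32))
# -> ['OpenStack-STFC', 'OpenStack-TUBITAK']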
I will probably need to have 2 sections, one for existing resources and another for potential resources (i.e. those which could be generated dynamically if needed). I think the dynamic resources should be separate because it's not 100% certain it will be possible to get them, e.g. if you want a lot of CPUs in a single node and the cloud itself doesn't have enough free resources.
Yes, that would be a good start. If that is the total resources allocated, a second step would be to see how many of each type are free (or maybe that is what you are already suggesting).
So something like this perhaps?
[
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 0, "memory": 0}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 60, "memory": 244}, "free": {"cpus": 0, "memory": 126}, "site": "OpenStack-STFC"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 0, "memory": 0}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 2, "memory": 4}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 8, "memory": 32}, "free": {"cpus": 8, "memory": 32}, "site": "OpenStack-STFC"},
  {"capacity": {"cpus": 60, "memory": 244}, "free": {"cpus": 0, "memory": 126}, "site": "OpenStack-STFC"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 0, "memory": 0}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 60, "memory": 244}, "free": {"cpus": 2, "memory": 130}, "site": "OpenStack-STFC"},
  {"capacity": {"cpus": 16, "memory": 93}, "free": {"cpus": 2, "memory": 66}, "site": "OpenStack-UNIV-LILLE"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 2, "memory": 4}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 2, "memory": 4}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 60, "memory": 244}, "free": {"cpus": 0, "memory": 126}, "site": "OpenStack-STFC"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 0, "memory": 0}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 16, "memory": 32}, "free": {"cpus": 0, "memory": 0}, "site": "OpenStack-TUBITAK"},
  {"capacity": {"cpus": 16, "memory": 93}, "free": {"cpus": 0, "memory": 62}, "site": "OpenStack-UNIV-LILLE"}
]
Now it is less clear to me what the per-node resources are. What about simply:
{
  "cpus": 8,
  "memory": 32,
  "site": "OpenStack-STFC",
  "nodes": 5,
  "nodes_free": 2
}
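For illustration, a minimal sketch of how the per-node list could be collapsed into that grouped form; group_nodes is hypothetical, and note it only counts a node as free when all of its cpus are free, which is exactly where the partially-free case raised below gets awkward:

from collections import defaultdict

def group_nodes(per_node):
    # Collapse per-node capacity/free entries into one row per
    # (site, cpus, memory) combination, with node counts.
    groups = defaultdict(lambda: {"nodes": 0, "nodes_free": 0})
    for node in per_node:
        key = (node["site"], node["capacity"]["cpus"], node["capacity"]["memory"])
        groups[key]["nodes"] += 1
        # Count a node as free only when all of its cpus are free;
        # partially free nodes (e.g. 2 of 16 cpus) vanish in this summary.
        if node["free"]["cpus"] == node["capacity"]["cpus"]:
            groups[key]["nodes_free"] += 1
    return [{"site": site, "cpus": cpus, "memory": memory, **counts}
            for (site, cpus, memory), counts in groups.items()]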
What about nodes which are only partially free?
Okay, I didn't anticipate that case (are most of the virtual nodes exclusive use or shared use?). There are two use cases we would use this information for
Is there something working here or on #156 that we could start to use?
If #156 is not possible yet, I at least need to decide on a default value for procs-per-node in our multi-node requests (use case 2 above). Maybe I should just plump for 16?
In reality, 16 is a reasonable default at the moment.
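For illustration, with 16 as an assumed default, use case 2 reduces to rounding up the node count:

import math

DEFAULT_PROCS_PER_NODE = 16  # assumed default, per the discussion above

def nodes_for(total_procs, procs_per_node=DEFAULT_PROCS_PER_NODE):
    # Round up, so e.g. a 100-proc request gets 7 nodes with the last
    # node only partially filled.
    return math.ceil(total_procs / procs_per_node)

print(nodes_for(96))  # -> 6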
The first version of the prominence resources command gives output like this:
$ prominence resources
Existing resources
    Total            Free
 Cpus  Memory    Cpus  Memory   Site
   60     244       0     124   OpenStack-STFC
    8      32       8      32   OpenStack-STFC
   60     244       0     124   OpenStack-STFC
   32      94       0      30   OpenStack-TUBITAK
   32      94      32      94   OpenStack-TUBITAK
   60     244      12     148   OpenStack-STFC
   16      47       0      15   OpenStack-TUBITAK
   16      47       0      15   OpenStack-TUBITAK
   16      32      16      32   OpenStack-TUBITAK
   60     244      12     148   OpenStack-STFC
   32      94      16      62   OpenStack-TUBITAK
   64     472      64     472   OpenStack-MetaCentrum

Potential resources
--coming soon--
Each line corresponds to an existing worker node on which the user making the request is allowed to run jobs.
I will make this available on Monday. The next step will be to also list potential resources, i.e. resources which don't exist yet but could be created. This won't be 100% accurate, as no private or public cloud can tell you whether it's definitely possible to create a VM of a particular size unless you actually try it.
Would it be most helpful to sort the resources by free cpus? If the list gets long, identical entries could be grouped somehow.
EDIT: prominence resources | uniq -c works well (with client v0.17.0).
Yes, I had already thought about sorting by free cpus (descending) to make it clearer, and probably not displaying items with no free cpus by default.
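For illustration, a minimal sketch of that display logic over the capacity/free entries shown earlier; nothing here is an existing client option:

from collections import Counter

def display(entries):
    # Group identical rows (like uniq -c), hide rows with no free cpus,
    # and sort by free cpus, descending.
    rows = Counter((e["capacity"]["cpus"], e["capacity"]["memory"],
                    e["free"]["cpus"], e["free"]["memory"], e["site"])
                   for e in entries)
    visible = [(row, n) for row, n in rows.items() if row[2] > 0]
    for (tc, tm, fc, fm, site), n in sorted(visible, key=lambda x: -x[0][2]):
        print(f"{n:>3} x {tc:>4} {tm:>6} {fc:>4} {fm:>6}  {site}")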