Closed grondo closed 5 years ago
Thanks, good points. I don't want to stall your progress with my minor questions. However it has been useful for my understanding, so thanks!
It has also been helpful to flesh out more details as well. Thanks!
It might help me if we could work through how some simple use cases might work using GraphML from above... e.g. find the correct layout of tasks across R for simple task slot shapes (e.g. 1 core, 1 socket, 1 socket, 1 core, etc).
Yeah, I will look more deeply into slot HOWTOs; so I will keep you posted. I had struggled with this while the ECP milestone was coming due, and punted it by not considering it for that work. So, I need to go back and spend some quality time to work on that logic. I think the logic of finding slots from R is its own topic, though. Regardless of what R spec we use, finding slots from R requires its own effort IMHO.
I believe the result of this discussion is:
We agreed that this should be the area of near-term investigation. For now, I call this "slot support for task layout and resource grouping." This is a co-investigation between execution and scheduling services.
Need experience on slot support both for the resource matching logic (scheduler)
FYI -- I have been prototyping slot support using my resource-query and at this point I believe this is doable -- essentially first detecting the corresponding resource level for slotting and calling into a special slot-aware subtree walk.
Notes from our September meeting about the format for R: https://github.com/flux-framework/rfc/wiki/Brainstorming-on-R
@grondo and @SteVwonder: As we discussed, I will post some R representations to push forward our R spec discussion. At this point, it should be pretty easy for our resource infrastructure to emit a fully concretized R with various formats. So we can play with various formats.
Given a tiny machine (called tiny0) and the following jobspec:
version: 1
resources:
  - type: cluster
    count: 1
    with:
      - type: rack
        count: 1
        with:
          - type: node
            count: 1
            with:
              - type: slot
                count: 1
                label: default
                with:
                  - type: socket
                    count: 1
                    with:
                      - type: core
                        count: 1
                      - type: memory
                        count: 4
The resource infrastructure can generate:
- type: cluster
  count: 1
  name: tiny0
  id: 0
  with:
    - type: rack
      count: 1
      name: rack0
      id: 0
      with:
        - type: node
          count: 1
          name: node0
          id: 0
          with:
            - type: socket
              count: 1
              name: socket0
              id: 0
              exclusive: true
              with:
                - type: memory
                  count: 2
                  name: memory1
                  id: 1
                  exclusive: true
                - type: memory
                  count: 2
                  name: memory0
                  id: 0
                  exclusive: true
                - type: core
                  count: 1
                  name: core0
                  id: 0
                  exclusive: true
A few Issues:
with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.
With the current format, I can't add information that the resource infrastructure uses to manage the graph complexity. For example, the infrastructure allows you to add what subsystem(s) (e.g., containment, network connective and power) a vertex or an edge belongs to. But as is, I can't pass that information and this can affect nested instances. A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much.
slot and rank are missing. As we discussed before, in a fully concretized resource set, it seems better to encode that information as attributes of their resource vertices. There is an opposite problem in this case. Can we add this info such that it won't affect the overhead of the scheduler's R parsing much.
Next steps:
Once I get your first feedback, I can show some examples of having slot and rank added to R using resource matching module with hwloc reader.
Let's also discuss what are the implications of adding edge key and also optional info key.
I just saw a next step in @SteVwonder's excellent note:
First implement the emitter in resource matching service
Emit simple examples as JSON
Then work on a reader
I think the JSON representation of the above R looks like:
{
"type": "cluster",
"count": 1,
"name": "tiny0",
"id": 0,
"with": {
"type": "rack",
"count": 1,
"name": "rack0",
"id": 0,
"with": {
"type": "node",
"count": 1,
"name": "node0",
"id": 0,
"with": {
"type": "socket",
"count": 1,
"name": "socket0",
"id": 0,
"exclusive": true,
"with": [{
"type": "core",
"count": 1,
"name": "core0",
"id": 0,
"exclusive": true
},
{
"type": "memory",
"count": 2,
"name": "memory1",
"id": 1,
"exclusive": true
},
{
"type": "memory",
"count": 2,
"name": "memory0",
"id": 0,
"exclusive": true
}
]
}
}
}
}
@dongahn, thanks for pushing this forward. Here are some thoughts after our quick discussion yesterday. I don't claim to have any answers or great insight, but I want to keep the discussion moving forward.
R_lite format for now, meanwhile extending the format step-wise with new versions as requirements evolve. To that end, perhaps we want to tweak R_lite to include version and/or version_name fields?
Some thoughts on your "issues" above:
with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.
I trust that you've thought this through better than I, but I don't see how the multiplicative with edge has confusing semantics. In a fully concretized resource set, you can still emit with: links where count is 1 and not have the multiplicative property, however allowing count > 1 for identical resource types in the set would allow for much smaller emitted R.
I agree, however, the right approach might be to start with edge: as you suggest, then add special case links like with: that are handled in very specific ways. For example, in a fully "concretized" resource set, no resources are actually identical, because we are assigning or describing distinct resources. So a multiplicative edge like with: would need extra information about how to expand the count > 1 items into count distinct items. E.g. you might have an idset: instead of ids, and for example, 4 cores on a socket might become:
{ "type": "socket",
"id": 0,
"exclusive": false,
"with": [
{ "type": "core",
"count": 4,
"idset": "0-3",
"exclusive": true
}
]
}
A parser of this version of R would know to expand this with: directive into:
{ "type": "socket",
"id": 0,
"exclusive": false,
"with": [
{ "type": "core",
"name": "core0",
"count": 1,
"id": 0,
"exclusive": true
},
{ "type": "core",
"name": "core1",
"count": 1,
"id": 1,
"exclusive": true
},
{ "type": "core",
"name": "core2",
"count": 1,
"id": 2,
"exclusive": true
},
{ "type": "core",
"name": "core3",
"count": 1,
"id": 3,
"exclusive": true
}
]
}
or whatever the equivalent with edge: would be. This is similar to what the original RDL experiment did, and it allowed for very compact representations of thousands of homogeneous nodes. I'm not saying we need this support now, but it would perhaps be a point of future evolution for the R format.
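As a rough illustration of the expansion described above, here is a minimal Python sketch. It assumes the hypothetical idset key from the example, that idsets are comma-separated integers and ranges such as "0-3", and that expanded names are formed as type plus id; none of this is specified anywhere yet.

```python
import copy


def parse_idset(idset):
    """Expand a string such as "0-3,8" into a sorted list of integers."""
    ids = set()
    for part in idset.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-", 1)
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def expand(vertex):
    """Return the list of count == 1 vertices equivalent to one compact vertex."""
    children = [c for child in vertex.get("with", []) for c in expand(child)]
    if "idset" not in vertex:
        out = dict(vertex)
        if "with" in vertex:
            out["with"] = children
        return [out]
    ids = parse_idset(vertex["idset"])
    if len(ids) != vertex.get("count", len(ids)):
        raise ValueError("count does not match idset")
    result = []
    for i in ids:
        # Copy everything except the compact-form keys, then make it concrete.
        v = {k: val for k, val in vertex.items() if k not in ("idset", "with")}
        v.update(name=f"{vertex['type']}{i}", count=1, id=i)
        if children:
            v["with"] = copy.deepcopy(children)
        result.append(v)
    return result


# The compact socket from the example above (hypothetical format).
socket = {
    "type": "socket", "id": 0, "exclusive": False,
    "with": [{"type": "core", "count": 4, "idset": "0-3", "exclusive": True}],
}
expanded = expand(socket)[0]
```

Here expanded["with"] holds four distinct core vertices, each with count 1, which is the shape of the expanded form shown above.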
A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much.
What we want is a way to add annotation information to R, preferably I think (mostly) outside of the main JSON for R.
We have already talked about allowing generic attributes for resources in R as annotation outside of normal R use cases. Initially, what if advanced schedulers used this information to reference resources outside of the main R hierarchy? Then the scheduler could embed its extra topology and graph information in one or more completely separate sections of R, which would be considered opaque to anything but that scheduler's components.
This would mean that the top level R becomes an object, and the R resource format itself would be stored in a well known key within this object (say resources:). Any other emitter of R could embed a new key with extra information which would be ignored by most parsers, e.g.
{
"resources": [
{
"type": "cluster",
"name": "test",
"id": 0,
"with": [
{
"type": "node",
"name": "node",
"id": 112,
"with": [
{
"type": "socket",
"name": "socket",
"id": 0,
"with": [
{
"type": "core",
"name": "core",
"id": 1
}
]
}
]
}
]
}
],
"sched": {
"grug": "xml string..."
}
}
The sched.grug xml could possibly reference resources from resources array either by first embedding uuids into each resource, e.g. "attributes": { "uuid": ... } and reference unique resources that way, or resources could be back referenced some other way, e.g. perhaps by the resource uri something like cluster0/node112/socket0/core1. There are probably lots of other solutions as well.
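To make the uri-style back reference concrete, here is a small Python sketch that derives paths such as cluster0/node112/socket0/core1 from the resources array. It assumes a path segment is simply type followed by id, which is only one possible convention, and iter_paths is a made-up helper name.

```python
def iter_paths(resources, prefix=""):
    """Yield (path, vertex) for every vertex in a nested "with" hierarchy."""
    for vertex in resources:
        # Assumed convention: each path segment is <type><id>.
        path = f"{prefix}/{vertex['type']}{vertex['id']}".lstrip("/")
        yield path, vertex
        yield from iter_paths(vertex.get("with", []), path)


# Same shape as the example R above, with the opaque scheduler section.
R = {
    "resources": [
        {"type": "cluster", "name": "test", "id": 0, "with": [
            {"type": "node", "name": "node", "id": 112, "with": [
                {"type": "socket", "name": "socket", "id": 0, "with": [
                    {"type": "core", "name": "core", "id": 1},
                ]},
            ]},
        ]},
    ],
    "sched": {"grug": "xml string..."},
}

# An index like this would let scheduler data refer back into "resources".
index = dict(iter_paths(R["resources"]))
core = index["cluster0/node112/socket0/core1"]
```

A parser that only cares about resources never has to look at the sched key, while a scheduler can resolve its own references through such an index.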
slot and rank are missing.
This one requires more thought. slot and rank would be used directly by flux-core services like the execution system and job shell, so they should be encoded as first-class members of R. We'd have to think through if these items are encoded better in the resources section directly or if it might be easier in some separate section of R. I don't have any good ideas here.
@grondo: Sorry for the late response. Let me try to reason about this one by one.
I trust that you've thought this through better than I, but I don't see how the multiplicative with edge has confusing semantics. In a fully concretized resource set, you can still emit with: links where count is 1
Maybe I'm overthinking this, but I think this can be confusing if we allow a resource pool vertex with count > 1 to be an intermediate vertex. Say you want to represent 2 compute nodes as a resource pool under which you have 76 cores. A graph can certainly model this: One vertex with two compute nodes aggregated as a pool; Then, from this vertex you have 76 out-edges to core vertices.
Now, if your jobspec is,
version: 1
resources:
  - type: cluster
    count: 1
    with:
      - type: rack
        count: 1
        with:
          - type: slot
            count: 1
            label: default
            with:
              - type: node
                count: 2
                with:
                  - type: core
                    count: 1
And if you emit vertices and edges in a most simplistic way, you would get
- type: cluster
  count: 1
  name: tiny0
  id: 0
  with:
    - type: rack
      count: 1
      name: rack0
      id: 0
      with:
        - type: node
          count: 2
          name: node1
          id: 1
          exclusive: true
          with:
            - type: core
              count: 1
              name: core71
              id: 71
              exclusive: true
            - type: core
              count: 1
              name: core70
              id: 70
              exclusive: true
But given with being multiplicative, I think this can become ambiguous to interpret. In this case, with should be interpreted as associative, but that is the semantics of edge. We can certainly mandate that with in the concretized graph shall be interpreted as associative. But then, in preparation for when we need to support compression, there are benefits to maintaining the original multiplicative semantics of the with key.
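A quick way to see the ambiguity is to count cores in the emitted R above under both readings. The Python sketch below is only an illustration; the function name and the multiplicative flag are made up for this example.

```python
def total(vertex, rtype, multiplicative=True):
    """Count resources of type rtype at or below vertex.

    multiplicative=True:  a vertex's count multiplies everything under "with".
    multiplicative=False: "with" is a plain (associative) edge.
    """
    own = vertex["count"] if vertex["type"] == rtype else 0
    below = sum(total(c, rtype, multiplicative) for c in vertex.get("with", []))
    factor = vertex["count"] if multiplicative else 1
    return own + factor * below


# The emitted R from above: a pool of 2 nodes with two distinct core vertices.
R = {"type": "cluster", "count": 1, "with": [
    {"type": "rack", "count": 1, "with": [
        {"type": "node", "count": 2, "with": [
            {"type": "core", "count": 1, "id": 71},
            {"type": "core", "count": 1, "id": 70},
        ]},
    ]},
]}

cores_multiplicative = total(R, "core", multiplicative=True)   # 2 nodes x 2 cores
cores_associative = total(R, "core", multiplicative=False)     # the 2 listed cores
```

The multiplicative reading yields 4 cores while the associative reading yields the 2 cores that were actually allocated, which is the incorrect size interpretation I was worried about.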
In a fully concretized resource set, you can still emit with: links where count is 1 and not have the multiplicative property
Yes, I agree. In this case, since multiply-by-1 is the same as being associative, this should be okay. It is just that a full concretization is only possible when there is no coarsening in the scheduler's graph data, which in general cannot be assumed. We can require it in the resource data model of Flux, but I am unclear whether that is a good idea. The high-end systems in a distant future are headed towards the concept of "aggregated resources" where they no longer have the concept of "real" compute nodes...
I agree, however, the right approach might be to start with edge: as you suggest, then add special case links like with: that are handled in very specific ways.
I completely agree with you! I will propose the edge key somewhere so that I can use edge for the initial R. I think @trws once proposed this as part of the canonical jobspec, so I can look at the past commits to retrieve and review it. Seems we agreed that we want to keep the multiplication property of with and use it as a way to condense R later on.
A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much. What we want is a way to add annotation information to R, preferably I think (mostly) outside of the main JSON for R.
This sounds reasonable. One thing, though: this extra data does not currently amount to much information, so I am not sure there will be much benefit in having a separate section for it, at least at this point.
Why don't I generate a few examples where those scheduler specific data are directly emitted into each vertex and edge as their "attributes" and further our discussions?
slot and rank are missing.
This one requires more thought. slot and rank would be used directly by flux-core services like the execution system and job shell, so they should be encoded as first-class members of R. We'd have to think through if these items are encoded better in the resources section directly or if it might be easier in some separate section of R. I don't have any good ideas here.
Again some examples seem to help further our discussions. In those future examples, let me emit all of them into first-class members of R.
Seems the next steps should be:
R_lite. This is needed for the initial support for our new remote execution service although some extension will be required;
slot, rank and scheduler data directly embedded in R for further evaluation
@grondo: We didn't have a whole lot on the edge key back then. A commit only has this:
*edge*::
**XXX**: need specification for other "edge match descriptors"
My initial thought: perhaps we can define the edge key as:
"dflt_edge_attr": { "subsystem": string, "relationship": string }
"edge":
  ?"attr": { "subsystem": string, "relationship": string }
  ?"in": ( $vertex_label )
  "out": ( $vertex_label | $resource_vertex )
edge: The edge key SHALL indicate an edge from a resource vertex to another resource.
If the in key is present within it, this SHALL be an edge from the resource vertex referred to by its vertex label to the resource vertex captured by the out key. If in is omitted, the resource vertex containing the edge key is assumed to be the in vertex.
The out key SHALL either refer to the destination vertex with a vertex label or be a list conforming to the resource vertex specification. For the latter, each resource vertex that appears in this list is assumed to have an edge of the same type from the in vertex.
If attr is present, it will describe the subsystem to which this edge belongs and the relationship between the two connecting resource vertices. If omitted, it will inherit the default edge attributes.
With something like this, R in a typical case can look like:
dflt_edge_attr: { subsystem: containment, relationship: contains }
resource:
  - type: rack
    count: 1
    id: 0
    edge:
      out:
        - type: node
          name: node7
          id: 7
          count: 1
          edge:
            out:
              - type: socket
                count: 1
                edge:
                  out:
                    - type: core
                      name: core0
                      id: 0
                    - type: memory
                      name: memory0
                      count: 4
                      unit: GB
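To check that this proposal carries enough information to rebuild the graph, here is a minimal Python sketch of a reader. It handles only the case where out is a nested list of resource vertices (no in key, no vertex labels), and the dict below is my transcription of the example above, so treat the whole thing as an assumption-laden sketch.

```python
def flatten(r):
    """Turn nested edge/out vertices into flat vertex and edge lists."""
    default_attr = r.get("dflt_edge_attr", {})
    vertices, edges = [], []

    def visit(vertex):
        vid = len(vertices)
        vertices.append({k: v for k, v in vertex.items() if k != "edge"})
        edge = vertex.get("edge")
        if edge:
            # An edge without its own attr inherits the default attributes.
            attr = edge.get("attr", default_attr)
            for child in edge.get("out", []):
                edges.append((vid, visit(child), attr))
        return vid

    for top in r["resource"]:
        visit(top)
    return vertices, edges


R = {
    "dflt_edge_attr": {"subsystem": "containment", "relationship": "contains"},
    "resource": [
        {"type": "rack", "count": 1, "id": 0, "edge": {"out": [
            {"type": "node", "name": "node7", "id": 7, "count": 1, "edge": {"out": [
                {"type": "socket", "count": 1, "edge": {"out": [
                    {"type": "core", "name": "core0", "id": 0},
                    {"type": "memory", "name": "memory0", "count": 4, "unit": "GB"},
                ]}},
            ]}},
        ]}},
    ],
}

vertices, edges = flatten(R)
```

For this R the reader produces five vertices and four containment edges, all carrying the default attributes.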
For latter, the each resource vertex appears in this list is assumed to have an edge of the same type from the in vertex.
BTW, if a resource vertex has out edges of different types to other resources, we will have to end up emitting the edge key multiple times from the same in vertex. Are duplicate keys allowed in both YAML and JSON?
Are duplicate keys allowed in both YAML and JSON?
I believe keys in both YAML and JSON have to be unique.
we will have to end up emitting the edge key multiple times from the same in vertex.
I'm not sure I understand. in: and out: are valid keys for edge: (seems like it could better be called edges:), and both are lists, then you can specify any number of in or out edges for a given resource vertex?
We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.
"dflt_edge_attr": { “subsystem”: string, “relationship”: string }
I suggest we don't add top-level keys in R like this. For readability, extensibility (and a bit of sanity), I'd suggest something namespaced, like
defaults:
  edge:
    attrs: { "subsystem": "containment", "relationship": "contains" }
I'm not sure I understand. in: and out: are valid keys for edge: (seems like it could better be called edges:), and both are lists, then you can specify any number of in or out edges for a given resource vertex?
Great idea. Let me play with it. Love the idea of edges plural.
We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.
Yes. We almost have to allow this. It's just that I also wanted to make the similar structure of jobspec's resource section also a valid R format.
I suggest we don't add top-level keys in R like this. For readability, extensibility (and a bit of sanity), I'd suggest something namespaced, like
Yup! I was thinking along the same line.
Great idea. Let me play with it. Love the idea of edges plural.
defaults:
  edge:
    attrs: { subsystem: containment, out: contains, in: in }
resource:
  - type: rack
    count: 1
    id: 0
    edges:
      - out:
          - type: node
            name: node7
            id: 7
            count: 1
            edges:
              - out:
                  - type: socket
                    count: 1
                    edges:
                      - out:
                          - type: core
                            name: core0
                            count: 1
                            id: 0
                          - type: memory
                            name: memory0
                            count: 4
                            unit: GB
                      # how to annotate different out-edge type
                      - out:
                          type: foo
                          name: foo1
                          count: 1
I like this direction. But I am not clear what is the best way to annotate an out edge when it has a different attribute than the default. Now I remember I used the singular edge key because of this. @grondo: any idea?
edges:
  - out:
      ? attrs: {}
      vtx: $resource_vertex
We can also do it this way at the expense of being verbose...of course.
with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.
Note that, at least early on in here, that was meant to be dealt with by range expansion on names and or IDs such that you could have something like:
type: node
name: n[1-50]
count: 50
- type: core
...
I'm not sure we still want to do that, but it's an option. Otherwise, for machine generated R, it could just be explicitly laid out with counts of only 1.
If I'm following correctly we have an edges: key which specifies a list of edges that are currently either in: or out:, and each of these keys are in turn lists of vertexes connected by these types of edges.
However, there may be multiple types of in or out edges.
Maybe new edge types should be specified in an outer dictionary, along with each type's attributes? This is probably better and less verbose than using defaults:
properties:
  edge:
    with: { "attrs": { "subsystem": "containment", "out": "contains", "in": "in" }}
    foo: { "attrs": ... } # set attributes for edge type "foo"
Or something similar.
Would this work? The drawback is that properties can't be overridden within the spec, like if we had separate attr and vertices keys...
In another thread we had other ways to specify edges, this is a version I had for a relatively dense way to hand-write edges for example:
type: node
<power:
- type: pdu
>with: core
<with: rack
Where the prefix characters on the key represented either
type: Core
count: 1
tasks: 1 # defaults to one, meaning one of these rspecs per task, to get one total, use *
sharing: exclusive
contains: []
links:
  - type: uses
    direction: out
    target: 17
We discussed this concept and expressing it at length in that issue.
Thanks for commenting @trws! I freely admit I've completely lost context on what we've discussed before.
Happy to, sorry it took so long actually, OpenMP F2F meeting this week. I like the idea of doing "edges: type : ..." by the way, that's a nice way to handle multiple types without all the extra verbosity of having a "type: ..." key on every one.
We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.
Yes. We almost have to allow this. It just that I also wanted to make the similar structure of jobspec's resource section also a valid R format.
Throwing one crazy idea out there: it seems a generic graph format "standard" for JSON is also emerging (like graphml being a small subset of xml to specify a graph), and one possibility is to latch onto that format instead of reinventing our own... http://jsongraphformat.info
I actually used one of those as the serialization format for the test implementation of the original python version of jobspec. As long as the graph format supports directed multigraphs it would be perfectly reasonable to use it. Asking users to write it is another issue, but using it for passing graphs around would be fine, and could even store “canonicalized” jobspec for that matter.
On 29 Jan 2019, at 13:38, Dong H. Ahn wrote:
Also https://github.com/jsongraph/json-graph-specification
I actually used one of those as the serialization format for the test implementation of the original python version of jobspec. As long as the graph format supports directed multigraphs it would be perfectly reasonable to use it.
Oh. Good. I am looking at this, and it seems to have all the constructs I need. I will play with it some more then.
Asking users to write it is another issue
Right. This isn’t so human friendly. The current resource section of jobspec would be far better for that purpose. I don’t know if our requirements include R to be generated directly by human users.
but using it for passing graphs around would be fine, and could even store “canonicalized” jobspec for that matter.
Yes for the future when the jobspec support a full graph, this could be useful as well.
OK. It looks like JSON Graph Format (JGF) gives me everything that I need to emit the resource graph. See below an example in JGF to encode the graph in https://github.com/flux-framework/rfc/issues/109#issuecomment-458651472
Pros:
Cons:
{
"graph":{
"nodes":[
{
"id":"0",
"metadata":{
"type":"rack",
"name":"rack0",
"id":0
}
},
{
"id":"1",
"metadata":{
"type":"node",
"name":"node7",
"id":7
}
},
{
"id":"2",
"metadata":{
"type":"socket",
"name":"socket0",
"id":0
}
},
{
"id":"3",
"metadata":{
"type":"core",
"name":"core0",
"id":0
}
},
{
"id":"4",
"metadata":{
"type":"memory",
"name":"memory0",
"count":2,
"unit":1073741824,
"id":0
}
},
{
"id":"5",
"metadata":{
"type":"foo",
"name":"foo0",
"id":0
}
}
],
"edges":[
{
"source":"0",
"target":"1",
"metadata":{
"subsystem":"containment",
"relationship":"contains"
}
},
{
"source":"1",
"target":"2",
"metadata":{
"subsystem":"containment",
"relationship":"contains"
}
},
{
"source":"2",
"target":"3",
"metadata":{
"subsystem":"containment",
"relationship":"contains"
}
},
{
"source":"2",
"target":"4",
"metadata":{
"subsystem":"containment",
"relationship":"contains"
}
},
{
"source":"2",
"target":"5",
"metadata":{
"subsystem":"foo",
"relationship":"bars"
}
}
]
}
}
A bit verbose (I will look at the spec to see if I can create a default attributes so that we can optimize. )
From the current spec, it wasn't clear if JGF has support for adding default node/edge properties. But since this is JSON, we can always add those properties as an extra data (building on @grondo's idea at https://github.com/flux-framework/rfc/issues/109#issuecomment-458664871).
{
"properties": {
"edges": [
{
"id": "default",
"subsystem": "containment",
"relationship": "contains"
},
{
"id": "foo",
"subsystem": "foo",
"relationship": "bars"
}
]
},
"graph": {
"nodes": [
{
"id": "0",
"metadata": {
"type": "rack",
"name": "rack0",
"id": 0
}
},
{
"id": "1",
"metadata": {
"type": "node",
"name": "node7",
"id": 7
}
},
{
"id": "2",
"metadata": {
"type": "socket",
"name": "socket0",
"id": 0
}
},
{
"id": "3",
"metadata": {
"type": "core",
"name": "core0",
"id": 0
}
},
{
"id": "4",
"metadata": {
"type": "memory",
"name": "memory0",
"count": 2,
"unit": 1073741824,
"id": 0
}
},
{
"id": "5",
"metadata": {
"type": "foo",
"name": "foo0",
"id": 0
}
}
],
"edges": [
{
"source": "0",
"target": "1"
},
{
"source": "1",
"target": "2"
},
{
"source": "2",
"target": "3"
},
{
"source": "2",
"target": "4"
},
{
"source": "2",
"target": "5",
"metadata": {
"property": "foo"
}
}
]
}
}
@grondo and @trws: thoughts?
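For what it's worth, a reader for this variant stays small. The Python sketch below resolves each edge against the extra properties section, treating the entry with id "default" as the fallback when an edge has no metadata.property. The properties section is our own addition rather than part of JGF, and resolve_edges is a made-up name.

```python
def resolve_edges(doc):
    """Attach full metadata to every edge using the "properties" defaults."""
    props = {
        p["id"]: {k: v for k, v in p.items() if k != "id"}
        for p in doc.get("properties", {}).get("edges", [])
    }
    resolved = []
    for edge in doc["graph"]["edges"]:
        # Edges without an explicit property fall back to the "default" entry.
        name = edge.get("metadata", {}).get("property", "default")
        resolved.append({
            "source": edge["source"],
            "target": edge["target"],
            "metadata": dict(props[name]),
        })
    return resolved


# Trimmed-down version of the document above (node list omitted for brevity).
doc = {
    "properties": {"edges": [
        {"id": "default", "subsystem": "containment", "relationship": "contains"},
        {"id": "foo", "subsystem": "foo", "relationship": "bars"},
    ]},
    "graph": {"nodes": [], "edges": [
        {"source": "2", "target": "4"},
        {"source": "2", "target": "5", "metadata": {"property": "foo"}},
    ]},
}

edges = resolve_edges(doc)
```

After resolution every edge carries the same metadata as in the fully spelled-out JGF example, so downstream code would not need to know about the shorthand.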
A bit verbose
Also, I found that JGF is much more condensed and legible than GraphML (https://github.com/flux-framework/rfc/issues/109#issuecomment-335958389), though!
It looks like JSON Graph Format (JGF) gives me everything that I need to emit the resource graph.
If JGF can become a standard, we should have plenty of tools we can use, e.g., validators, readers/writers, visualizers (things like jsongraph.py already exist), and editors for debugging and human interactivity;
:+1: From their website, it looks like they are also using json-schema for validation. So +1 for no added dependencies to read and another +1 for no added dependencies to validate.
Right. This isn’t so human friendly. The current resource section of jobspec would be far better for that purpose. I don’t know if our requirements include R to be generated directly by human users.
I agree with you that I don't expect users to have to write R, but they will most likely have to read it. I imagine myself frequently dumping R from the KVS to see what resources my job ran on. That being said, the examples you posted are quite legible IMO.
A bit verbose
Just a thought, but this format probably compresses really well. Lots of repeated tags and patterns. It wouldn't help human-readability, but if we pass the compressed version around in the messages, it can potentially help performance. We could also store it compressed in the KVS, but if we do that, we'll want to make it simple for users to decompress and view R from the KVS.
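If someone wants to try this, a quick experiment along these lines would do. This Python sketch builds a synthetic JGF-like document (the 256-core socket is made up, not a real R) and compares its serialized size with the zlib-compressed size; zlib is just one candidate codec.

```python
import json
import zlib


def make_jgf(num_cores):
    """Build a synthetic JGF-like graph: one socket containing num_cores cores."""
    nodes = [
        {"id": str(i), "metadata": {"type": "core", "name": f"core{i}", "id": i}}
        for i in range(num_cores)
    ]
    nodes.append(
        {"id": str(num_cores), "metadata": {"type": "socket", "name": "socket0", "id": 0}}
    )
    edges = [
        {"source": str(num_cores), "target": str(i),
         "metadata": {"subsystem": "containment", "relationship": "contains"}}
        for i in range(num_cores)
    ]
    return {"graph": {"nodes": nodes, "edges": edges}}


raw = json.dumps(make_jgf(256)).encode("utf-8")
packed = zlib.compress(raw)
restored = json.loads(zlib.decompress(packed))

# Compare the two sizes; the exact ratio depends on the graph being emitted.
print(len(raw), len(packed))
```

Decompressing and re-parsing gives back the same document, so a small helper that does exactly that would be enough to keep R viewable from the KVS.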
Also I found that JGF is much more condense and legible than GraphML (#109 (comment)), though!
:+1: :+1: I agree. Much nicer to look at than GraphML.
Cons: A bit verbose (I will look at the spec to see if I can create a default attributes so that we can optimize. ) Lose the human-friendly structure as jobspec's resource section has
Can we propose that the base R type still contains a hardware topology ("containment" hierarchy) as resource: or some other key, with a simple, more human readable hierarchy-only representation, while flux-sched and other advanced schedulers can extend the base R with a "graph:" (and other) sections which can be considered opaque to flux-core components?
Really, what is required from flux-core is the basic containment hierarchy, and an ability to map R to ranks and construct an R_local (exec system), and map task slots to local resources (job shell). We could also offer simple tools that parse and display the R for a job (e.g. in queue listing we might just need to pull out a host list, or in a more detailed listing a user-friendly representation of R), or users could dump just the resource: section of R directly.
Not that I am fully opposed to specifying the format of R as JGF, but at this point it seems like overkill for flux-core components. But I'd hate to add extra complexity elsewhere for only a modicum of simplicity in flux-core.
Can we propose that the base R type still contains a hardware topology ("containment" hierarchy) as resource: or some other key, with a simple, more human readable hierarchy-only representation, while flux-sched and other advanced schedulers can extend the base R with a "graph:" (and other) sections which can be considered opaque to flux-core components?
I think a two section approach has several advantages. For example, this way, we don't have to include rank and slot info into the "graph" section needed for the nested schedulers.
One disadvantage is, though, R will be on the order of twice as big with the two section approach. That may be okay... Generally, the two section approach would be a bit faster in exchange for more needed space.
One alternative approach would be to develop a converter layer that converts the graph into the containment hierarchy that you require for the execution service.
With either approach, a question remains as to what the exact format for the resource section should be. As we discussed above, I would at least avoid the use of with as the first cut.
How about:
https://github.com/flux-framework/rfc/issues/109#issuecomment-458651472 or similar under the resource key and JGF under the sched or graph key?
For our current sprint, resource can have R_lite++ of course?
BTW, does the execution system need info on the higher level resources? Like rack or cluster? Or just node and down?
@grondo and @SteVwonder: I can start to draft an RFC on a two section proposal as a way to push forward this dialogue further, if you like.
That would be a great start! Thanks
I realize I'm coming to this a bit late, but I'd put in my 2c for keeping it one piece. The hardware topology should always be walkable by just limiting the graph representation to that kind of edge, and if it's in two formats in two parts at least some of the components will have to work with both. The job shell has to take R emitted from sched for example, so either sched needs to work with both formats or there has to be one that works for both. Perhaps it would be good to come up with an API or similar for what the core side wants to talk to, that could understand whatever the format is underneath and provide the appropriate information rather than expecting it to walk the resource spec directly?
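As a sketch of what walking only one kind of edge could look like over the JGF example from earlier in this thread, here is some Python. The function name and the nested-dict return shape are made up for illustration; a real API for flux-core would need more thought.

```python
from collections import defaultdict


def containment_tree(jgf, subsystem="containment"):
    """Rebuild a hierarchy from a JGF graph using edges of one subsystem only."""
    graph = jgf["graph"]
    names = {n["id"]: n["metadata"]["name"] for n in graph["nodes"]}
    children = defaultdict(list)
    targets = set()
    for edge in graph["edges"]:
        if edge.get("metadata", {}).get("subsystem") == subsystem:
            children[edge["source"]].append(edge["target"])
            targets.add(edge["target"])
    # Roots take part in the subsystem but are nobody's child in it.
    roots = [n["id"] for n in graph["nodes"]
             if n["id"] in children and n["id"] not in targets]

    def build(vid):
        return {names[vid]: [build(c) for c in children.get(vid, [])]}

    return [build(r) for r in roots]


def _edge(source, target, subsystem, relationship):
    return {"source": source, "target": target,
            "metadata": {"subsystem": subsystem, "relationship": relationship}}


jgf = {"graph": {
    "nodes": [
        {"id": "0", "metadata": {"type": "rack", "name": "rack0", "id": 0}},
        {"id": "1", "metadata": {"type": "node", "name": "node7", "id": 7}},
        {"id": "2", "metadata": {"type": "socket", "name": "socket0", "id": 0}},
        {"id": "3", "metadata": {"type": "core", "name": "core0", "id": 0}},
        {"id": "4", "metadata": {"type": "memory", "name": "memory0", "id": 0}},
        {"id": "5", "metadata": {"type": "foo", "name": "foo0", "id": 0}},
    ],
    "edges": [
        _edge("0", "1", "containment", "contains"),
        _edge("1", "2", "containment", "contains"),
        _edge("2", "3", "containment", "contains"),
        _edge("2", "4", "containment", "contains"),
        _edge("2", "5", "foo", "bars"),
    ],
}}

tree = containment_tree(jgf)
```

The foo0 vertex drops out because its only edge belongs to a different subsystem, and what remains is the plain rack/node/socket hierarchy that the exec system and job shell care about.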
@trws:
Thank you for your thoughts.
I think what @grondo wants is to put what's required by the execution service in one section in an easy-to-use format and the full information pertaining to the scheduler in the second section. This isn't too difficult for the scheduler to do. I hope that we don't find a situation where any component needs to read both sections. Clearly this isn't optimal in terms of storage and R producer performance. But it has a consumer performance advantage (the execution system only needs to read one section, while the nested scheduler instance doesn't need to read data like "rank" and "slots") and lowers the complexity of the execution system software. The API approach (or converter approach as I suggested above) would be another excellent way to overcome this issue. But what @grondo seems to be concerned about is the complexity of designing an API at this point, as it will have to be graph code.
Fair enough. I'm all for using the format that makes sense, just pointing out that just because it's stored as a graph (or as a tree) doesn't mean we have to access it that way in terms of the API. The resource spec format is a graph jammed into a tree format, so is JGF, so there are other options if for some reason it turns out to be complicated to get a simple serialization format.
Completely agreed! Good thoughts @trws.
Thank you for the good discussions. We will probably want to refer back to this when we evolve R later.
But for now PR https://github.com/flux-framework/rfc/pull/155 resolved the ticket.
This issue is being opened to start a discussion on the use cases, API, and/or specification for the R as in RFC 15. R is the serialized version of any resource set, and is presumably produced by the serializer described in RFC4, consumed by the resource service in an instance as configuration, and used by the IMP and job shell to determine shape of containment and local resource slots.
In essence, the R format will be the way composite resource and resource configuration information will be transmitted to and from instances of Flux.
Ideally, the purpose of this issue is to determine the format of R such that a new RFC could be drafted.
To get the discussion started, here are some high level requirements and use cases for R:
R should act as resource configuration input to an instance, therefore it may be that configuration of even the system instance is written in R spec, or the configuration language (RDL?) generates R. (in fact, one use case might be to directly generate R from hwloc data)
Execution service in an instance needs to be able to generate R_local from R for each rank. So given a rank or even generic "resource vertex", there should be a function to generate an R_n from R, where R_n is a hierarchical subset of R.
The containment plugins in the IMP will need to query R_local for the list of local resources of given type or types on which the containment plugins operate. For instance, a memory plugin will need to determine the amount and location of RAM contained in R_local in order to set up memcg limits. Similarly a Socket/CPU plugin would need to iterate over or query the list of local sockets/cores in R_local to add these to the cgroup.
The job shell will use jobspec+R to determine the local 'task slots' that map to commands in the 'tasks' section.
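To make the first few use cases more tangible, here is a hypothetical Python sketch over the with-style hierarchy for the tiny0 machine from this thread. It selects the subtree for one node as a stand-in for R_local (rank is not encoded in that example, so the node id is used instead) and then queries it by resource type. The function names are invented for illustration and are not a proposed API.

```python
def subtree(vertex, rtype, rid):
    """Return the first vertex matching type and id, with everything below it."""
    if vertex["type"] == rtype and vertex["id"] == rid:
        return vertex
    for child in vertex.get("with", []):
        hit = subtree(child, rtype, rid)
        if hit is not None:
            return hit
    return None


def find_by_type(vertex, rtype):
    """List every vertex of the given type at or below vertex."""
    found = [vertex] if vertex["type"] == rtype else []
    for child in vertex.get("with", []):
        found.extend(find_by_type(child, rtype))
    return found


R = {"type": "cluster", "name": "tiny0", "id": 0, "count": 1, "with": [
    {"type": "rack", "name": "rack0", "id": 0, "count": 1, "with": [
        {"type": "node", "name": "node0", "id": 0, "count": 1, "with": [
            {"type": "socket", "name": "socket0", "id": 0, "count": 1, "with": [
                {"type": "memory", "name": "memory1", "id": 1, "count": 2},
                {"type": "memory", "name": "memory0", "id": 0, "count": 2},
                {"type": "core", "name": "core0", "id": 0, "count": 1},
            ]},
        ]},
    ]},
]}

r_local = subtree(R, "node", 0)  # stand-in for R_local on one rank
memory_total = sum(v["count"] for v in find_by_type(r_local, "memory"))
core_ids = [v["id"] for v in find_by_type(r_local, "core")]
```

A memory plugin would use something like memory_total to size its limit and a CPU plugin would use core_ids, without either needing to understand the rest of R.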
Dependency management here might get challenging. The IMP is a user of R_n, but we want to ideally eliminate dependencies in the flux-security project on other flux-framework projects. Possible approaches here might include: