Need specification for "resource set", R

grondo commented 6 years ago

This issue is being opened to start a discussion on the use cases, API, and/or specification for R, as in RFC 15. R is the serialized version of any resource set. It is presumably produced by the serializer described in RFC 4, consumed by the resource service in an instance as configuration, and used by the IMP and job shell to determine the shape of containment and local resource slots.

In essence, the R format will be the way composite resource and resource configuration information will be transmitted to and from instances of Flux.

Ideally, the purpose of this issue is to determine the format of R such that a new RFC could be drafted.

To get the discussion started, here are some high level requirements and use cases for R:

R should act as resource configuration input to an instance, therefore it may be that configuration of even the system instance is written in R spec, or the configuration language (RDL?) generates R. (in fact, one use case might be to directly generate R from hwloc data)
Execution service in an instance needs to be able to generate R_local from R for each rank. So given a rank or even generic "resource vertex", there should be a function to generate an R_n from R, where R_n is a hierarchical subset of R.
The containment plugins in the IMP will need to query R_local for the list of local resources of given type or types on which the containment plugins operate. For instance, a memory plugin will need to determine the amount and location of RAM contained in R_local in order to set up memcg limits. Similarly a Socket/CPU plugin would need to iterate over or query the list of local sockets/cores in R_local to add these to the cgroup.
The job shell will use jobspec+R to determine the local 'task slots' that map to commands in the 'tasks' section.

Dependency management here might get challenging. The IMP is a user of R_n, but we want to ideally eliminate dependencies in the flux-security project on other flux-framework projects. Possible approaches here might include:

The IMP could take a subset of the R specification, simple enough to parse with its own parser and offer some simplified interface to IMP plugins that do containment. Containment plugins probably only really need to get a list of local resources from R, as long as the logical IDs match the actual system logical IDs. The IMP's interface to R could later be expanded to offer higher level functionality for advanced containment (though I don't have any use cases that I can think of here)
Alternately, the IMP itself could treat R as opaque data, passed to plugins. The plugins would then have a dependency on some library from a system-installed flux core or sched project.

dongahn commented 6 years ago

Thanks, good points. I don't want to stall your progress with my minor questions. However it has been useful for my understanding, so thanks!

It has also been helpful to flesh out more details as well. Thanks!

It might help me if we could work through how some simple use cases might work using GraphML from above... e.g. find the correct layout of tasks across R for simple task slot shapes (e.g. 1 core, 1 socket, 1 socket, 1 core, etc).

Yeah, I will look more deeply into slot HOWTOs and keep you posted. I struggled with this while the ECP milestone was coming due and punted by not considering it for that work. So I need to go back and spend some quality time on that logic. I think the logic of finding slots from R is its own topic, though. Regardless of what R spec we use, finding slots from R requires its own effort IMHO.

dongahn commented 6 years ago

I believe the results of this discussion are:

GraphML may be suited for this and R specification in and of itself would be an easier part.
Need experience on slot support both for the resource matching logic (scheduler) and for tasks layout logic (at the job shell level).
Indexing for accelerating finding node-local resources could just be an internal logic of R libraries.

We agreed that this should be the area of near-term investigation. For now, I am calling this "slot support for task layout and resource grouping." This is a co-investigation between the execution and scheduling services.

dongahn commented 6 years ago

Need experience on slot support both for the resource matching logic (scheduler)

FYI -- I have been prototyping slot support using my resource-query and at this point I believe this is doable -- essentially first detecting the corresponding resource level for slotting and calling into a special slot-aware subtree walk.

SteVwonder commented 5 years ago

Notes from our September meeting about the format for R: https://github.com/flux-framework/rfc/wiki/Brainstorming-on-R

dongahn commented 5 years ago

@grondo and @SteVwonder: As we discussed, I will post some R representations to push forward our R spec discussion. At this point, it should be pretty easy for our resource infrastructure to emit a fully concretized R with various formats. So we can play with various formats.

The first obvious idea is to use the resource section format of the jobspec:

Given a tiny machine (called tiny0) and the following jobspec: tiny

version: 1
resources:
    - type: cluster
      count: 1
      with:
        - type: rack
          count: 1
          with:
            - type: node
              count: 1
              with:
                  - type: slot
                    count: 1
                    label: default
                    with:
                      - type: socket
                        count: 1
                        with:
                          - type: core
                            count: 1
                          - type: memory
                            count: 4

The resource infrastructure can generate:

      - type: cluster
        count: 1
        name: tiny0
        id: 0
        with:
         - type: rack
           count: 1
           name: rack0
           id: 0
           with:
            - type: node
              count: 1
              name: node0
              id: 0
              with:
               - type: socket
                 count: 1
                 name: socket0
                 id: 0
                 exclusive: true
                 with:
                  - type: memory
                    count: 2
                    name: memory1
                    id: 1
                    exclusive: true
                  - type: memory
                    count: 2
                    name: memory0
                    id: 0
                    exclusive: true
                  - type: core
                    count: 1
                    name: core0
                    id: 0
                    exclusive: true

A few Issues:

with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.
With the current format, I can't add information that the resource infrastructure uses to manage the graph complexity. For example, the infrastructure allows you to add what subsystem(s) (e.g., containment, network connectivity, and power) a vertex or an edge belongs to. But as is, I can't pass that information and this can affect nested instances. A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much.
slot and rank are missing. As we discussed before, in a fully concretized resource set, it seems better to encode that information as attributes of their resource vertices. There is an opposite problem in this case: can we add this info such that it won't affect the overhead of the scheduler's R parsing much?

Next steps: Once I get your first feedback, I can show some examples of having slot and rank added to R using resource matching module with hwloc reader.

Let's also discuss what are the implications of adding edge key and also optional info key.

dongahn commented 5 years ago

I just saw a next step in @SteVwonder's excellent note:

First implement the emitter in resource matching service
Emit simple examples as JSON
Then work on a reader

I think the JSON representation of the above R looks like:

{
    "type": "cluster",
    "count": 1,
    "name": "tiny0",
    "id": 0,
    "with": {
        "type": "rack",
        "count": 1,
        "name": "rack0",
        "id": 0,
        "with": {
            "type": "node",
            "count": 1,
            "name": "node0",
            "id": 0,
            "with": {
                "type": "socket",
                "count": 1,
                "name": "socket0",
                "id": 0,
                "exclusive": true,
                "with": [{
                        "type": "core",
                        "count": 1,
                        "name": "core0",
                        "id": 0,
                        "exclusive": true
                    },
                    {
                        "type": "memory",
                        "count": 2,
                        "name": "memory1",
                        "id": 1,
                        "exclusive": true
                    },
                    {
                        "type": "memory",
                        "count": 2,
                        "name": "memory0",
                        "id": 0,
                        "exclusive": true
                    }
                ]
            }
        }
    }
}
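
As a rough illustration of the reader side (the containment-plugin use case from the top of this issue), here is a minimal Python sketch that walks a nested R like the one above and collects the local resources of a given type. The function names `iter_children` and `find_resources` are hypothetical and not part of any Flux library, and the sketch assumes `with` may hold either a single object or a list.

```python
import json

def iter_children(vertex):
    """Yield child vertices; 'with' may be a single object or a list."""
    children = vertex.get("with", [])
    if isinstance(children, dict):
        children = [children]
    yield from children

def find_resources(vertex, wanted_type):
    """Return every vertex of wanted_type in the hierarchy rooted at vertex."""
    found = []
    if vertex.get("type") == wanted_type:
        found.append(vertex)
    for child in iter_children(vertex):
        found.extend(find_resources(child, wanted_type))
    return found

# Trimmed-down R in the nested form shown above (illustrative input only).
R = json.loads("""
{"type": "node", "count": 1, "name": "node0", "id": 0,
 "with": {"type": "socket", "count": 1, "name": "socket0", "id": 0,
          "with": [
            {"type": "core", "count": 1, "name": "core0", "id": 0},
            {"type": "memory", "count": 2, "name": "memory1", "id": 1},
            {"type": "memory", "count": 2, "name": "memory0", "id": 0}
          ]}}
""")

# A memory plugin might sum the memory counts; a CPU plugin might list core ids.
memory_total = sum(m["count"] for m in find_resources(R, "memory"))
core_ids = [c["id"] for c in find_resources(R, "core")]
print(memory_total, core_ids)
```

A plugin would only need the returned list, which fits the idea that the IMP could offer a simplified interface over a subset of R.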

grondo commented 5 years ago

@dongahn, thanks for pushing this forward. Here are some thoughts after our quick discussion yesterday. I don't claim to have any answers or great insight, but I want to keep the discussion moving forward.

Emitting R in JSON is perfect for now. I'm not sure I see a case where YAML would be preferred for R, except in the case of hand-edited "resource configuration" input. However, I assume we would rather offer tools or a higher-level DSL to do that anyway.
We should plan to use our existing R_lite format for now, meanwhile extending the format step-wise with new versions as requirements evolve. To that end, perhaps we want to tweak R_lite to include version and/or version_name fields?

Some thoughts on your "issues" above:

with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.

I trust that you've thought this through better than I, but I don't see how the multiplicative with edge has confusing semantics. In a fully concretized resource set, you can still emit with: links where count is 1 and not have the multiplicative property, however allowing count > 1 for identical resource types in the set would allow for much smaller emitted R.

I agree, however, the right approach might be to start with edge: as you suggest, then add special case links like with: that are handled in very specific ways. For example, in a fully "concretized" resource set, no resources are actually identical, because we are assigning or describing distinct resources. So a multiplicative edge like with: would need extra information about how to expand the count > 1 items into count distinct items. E.g. you might have an idset: instead of ids, and for example, 4 cores on a socket might become:

{ "type": "socket",
   "id": 0,
   "exclusive": false,
   "with": [
      { "type": "core",
         "count": 4,
         "idset": "0-3",
         "exclusive": true
      }
   ]
}

A parser of this version of R would know to expand this with: directive into:

{ "type": "socket",
   "id": 0,
   "exclusive": false,
   "with": [
      { "type": "core",
         "name": "core0",
         "count": 1,
         "id": 0,
         "exclusive": true
      },
      { "type": "core",
         "name": "core1",
         "count": 1,
         "id": 1,
         "exclusive": true
      },
      { "type": "core",
         "name": "core2",
         "count": 1,
         "id": 2,
         "exclusive": true
      },
      { "type": "core",
         "name": "core3",
         "count": 1,
         "id": 3,
         "exclusive": true
      }
   ]
}

or whatever the equivalent with edge: would be. This is similar to what the original RDL experiment did, and it allowed for very compact representations of thousands of homogeneous nodes. I'm not saying we need this support now, but it would perhaps be a point of future evolution for the R format.
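
The expansion step described above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the `idset` syntax (comma-separated ids and `lo-hi` ranges), the generated `name` of type plus id, and the function names are all hypothetical, since no R version defines them yet.

```python
import copy

def parse_idset(idset):
    """Expand an idset string such as "0-3" or "0,2,4-5" into a list of ints."""
    ids = []
    for part in idset.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            ids.extend(range(int(lo), int(hi) + 1))
        else:
            ids.append(int(part))
    return ids

def expand(vertex):
    """Return the list of distinct vertices that this vertex stands for."""
    children = [c for child in vertex.get("with", []) for c in expand(child)]
    if "idset" not in vertex:
        out = {k: v for k, v in vertex.items() if k != "with"}
        if children:
            out["with"] = children
        return [out]
    ids = parse_idset(vertex["idset"])
    if len(ids) != vertex.get("count", len(ids)):
        raise ValueError("count does not match idset")
    expanded = []
    for i in ids:
        v = {k: val for k, val in vertex.items() if k not in ("idset", "with")}
        v.update(name=f"{vertex['type']}{i}", count=1, id=i)
        if children:
            v["with"] = copy.deepcopy(children)
        expanded.append(v)
    return expanded

compact = {"type": "socket", "id": 0, "exclusive": False,
           "with": [{"type": "core", "count": 4, "idset": "0-3",
                     "exclusive": True}]}
socket = expand(compact)[0]
print([c["name"] for c in socket["with"]])
```

Checking `count` against the idset length is a design choice in this sketch; a real format could just as well drop `count` whenever `idset` is present.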

A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much.

What we want is a way to add annotation information to R, preferably I think (mostly) outside of the main JSON for R.

We have already talked about allowing generic attributes for resources in R as annotation outside of normal R use cases. Initially, what if advanced schedulers used this information to reference resources outside of the main R hierarchy? Then the scheduler could embed its extra topology and graph information in one or more completely separate sections of R, which would be considered opaque to anything but that scheduler's components.

This would mean that the top level R becomes an object, and the R resource format itself would be stored in a well known key within this object (say resources:). Any other emitter of R could embed a new key with extra information which would be ignored by most parsers, e.g.

{
  "resources": [
    {
      "type": "cluster",
      "name": "test",
      "id": 0,
      "with": [
        {
          "type": "node",
          "name": "node",
          "id": 112,
          "with": [
            {
              "type": "socket",
              "name": "socket",
              "id": 0,
              "with": [
                {
                  "type": "core",
                  "name": "core",
                  "id": 1
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "sched": {
    "grug": "xml string..."
  }
}

The sched.grug xml could possibly reference resources from resources array either by first embedding uuids into each resource, e.g. "attributes": { "uuid": ... } and reference unique resources that way, or resources could be back referenced some other way, e.g. perhaps by the resource uri something like cluster0/node112/socket0/core1. There are probably lots of other solutions as well.
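
To illustrate the URI-style back reference, here is a small Python sketch that indexes the `resources` array by a path built from each vertex's type and id. The URI scheme (`type` + `id`, joined by `/`) is only inferred from the `cluster0/node112/socket0/core1` example above, and `index_by_uri` is a hypothetical helper.

```python
def index_by_uri(vertices, prefix=""):
    """Map URIs such as 'cluster0/node112' to the vertex they name."""
    index = {}
    for v in vertices:
        uri = f"{prefix}{v['type']}{v['id']}"
        index[uri] = v
        index.update(index_by_uri(v.get("with", []), uri + "/"))
    return index

R = {
    "resources": [
        {"type": "cluster", "name": "test", "id": 0, "with": [
            {"type": "node", "name": "node", "id": 112, "with": [
                {"type": "socket", "name": "socket", "id": 0, "with": [
                    {"type": "core", "name": "core", "id": 1},
                ]},
            ]},
        ]},
    ],
    "sched": {"grug": "xml string..."},  # opaque to this reader
}

index = index_by_uri(R["resources"])
print(sorted(index))
```

A reader that does not understand `sched` simply never touches that key, while a scheduler component could resolve its references through an index like this.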

slot and rank are missing.

This one requires more thought. slot and rank would be used directly by flux-core services like the execution system and job shell, so they should be encoded as first-class members of R. We'd have to think through if these items are encoded better in the resources section directly or if it might be easier in some separate section of R. I don't have any good ideas here.

dongahn commented 5 years ago

@grondo: Sorry for the late response. Let me try to reason about this one by one.

I trust that you've thought this through better than I, but I don't see how the multiplicative with edge has confusing semantics. In a fully concretized resource set, you can still emit with: links where count is 1

Maybe I'm overthinking this, but I think this can be confusing if we allow a resource pool vertex with count > 1 to be an intermediate vertex. Say you want to represent 2 compute nodes as a resource pool under which you have 76 cores. A graph can certainly model this: One vertex with two compute nodes aggregated as a pool; Then, from this vertex you have 76 out-edges to core vertices.

Now, if your jobspec is,

version: 1
resources:
    - type: cluster
      count: 1
      with:
        - type: rack
          count: 1
          with:
            - type: slot
              count: 1
              label: default
              with:
                - type: node
                  count: 2
                  with:
                    - type: core
                      count: 1

And if you emit vertices and edges in a most simplistic way, you would get

      - type: cluster
        count: 1
        name: tiny0
        id: 0
        with:
         - type: rack
           count: 1
           name: rack0
           id: 0
           with:
            - type: node
              count: 2
              name: node1
              id: 1
              exclusive: true
              with:
               - type: core
                 count: 1
                 name: core71
                 id: 71
                 exclusive: true
               - type: core
                 count: 1
                 name: core70
                 id: 70
                 exclusive: true

But given with being multiplicative, I think this can become ambiguous to interpret. In this case, with should be interpreted as associative, but that is the semantics of edge. We can certainly mandate that with in the concretized graph shall be interpreted as associative. But then, in preparation for when we need to support compression, there are benefits to maintaining the original multiplicative semantics of the with key.
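
The ambiguity can be made concrete with a small Python sketch that counts cores in the emitted set above under both readings. Both counting functions are hypothetical illustrations of the two interpretations, not code from the resource infrastructure.

```python
def count_multiplicative(vertex, wanted, multiplier=1):
    """Each child's count is multiplied by the counts of its ancestors."""
    total_here = multiplier * vertex.get("count", 1)
    total = total_here if vertex["type"] == wanted else 0
    for child in vertex.get("with", []):
        total += count_multiplicative(child, wanted, total_here)
    return total

def count_associative(vertex, wanted):
    """Each child simply belongs to its parent; counts are not multiplied."""
    total = vertex.get("count", 1) if vertex["type"] == wanted else 0
    for child in vertex.get("with", []):
        total += count_associative(child, wanted)
    return total

# The pooled node vertex from the emitted example (count: 2, two core children).
pool = {"type": "node", "count": 2, "name": "node1", "id": 1,
        "with": [
            {"type": "core", "count": 1, "name": "core71", "id": 71},
            {"type": "core", "count": 1, "name": "core70", "id": 70},
        ]}

print(count_multiplicative(pool, "core"), count_associative(pool, "core"))
```

The same document yields 4 cores under the multiplicative reading and 2 under the associative one, which is the incorrect-size interpretation being described.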

dongahn commented 5 years ago

In a fully concretized resource set, you can still emit with: links where count is 1 and not have the multiplicative property

Yes, I agree. In this case, since multiply-by-1 is the same as being associative, this should be okay. It is just that a full concretization is only possible when you had no coarsening in the scheduler's graph data, which in general cannot be assumed. We can require it in the resource data model of Flux, but I am unclear whether it is a good idea. The high-end systems in a distant future are headed towards the concept of "aggregated resources" where they no longer have the concept of "real" compute nodes...

dongahn commented 5 years ago

I agree, however, the right approach might be to start with edge: as you suggest, then add special case links like with: that are handled in very specific ways.

I completely agree with you! I will propose the edge key somewhere so that I can use edge for the initial R. I think @trws once proposed this as part of the canonical jobspec, so I can look at the past commits to retrieve and review it. Seems we agreed that we want to keep the multiplication property of with and use it as a way to condense R later on.

dongahn commented 5 years ago

A question is, can we add this optional data into R in a way that doesn't increase the overhead of the remote execution system's R parsing much. What we want is a way to add annotation information to R, preferably I think (mostly) outside of the main JSON for R.

This sounds reasonable. One thing, though: these extra data are currently not much information, so I am not sure there will be much benefit to having a separate section for them, at least at this point.

Why don't I generate a few examples where those scheduler specific data are directly emitted into each vertex and edge as their "attributes" and further our discussions?

dongahn commented 5 years ago

slot and rank are missing.

This one requires more thought. slot and rank would be used directly by flux-core services like the execution system and job shell, so they should be encoded as first-class members of R. We'd have to think through if these items are encoded better in the resources section directly or if it might be easier in some separate section of R. I don't have any good ideas here.

Again some examples seem to help further our discussions. In those future examples, let me emit all of them into first-class members of R.

dongahn commented 5 years ago

Seems the next steps should be:

Propose the edge key
Have a load option to the resource matching service to generate R_lite. This is needed for the initial support for our new remote execution service although some extension will be required;
As part of that, add other emitter types: JSON and YAML;
Examples with slot, rank and scheduler data directly embedded in R for further evaluation

dongahn commented 5 years ago

@grondo: We didn't have a whole lot on the edge key back then. A commit has only this:

*edge*::    
    **XXX**: need specification for other "edge match descriptors"

My initial thought: perhaps we can define the edge key as:

"dflt_edge_attr": { "subsystem": string, "relationship": string }

"edge":
    ?"attr": { "subsystem": string, "relationship": string }
    ?"in": ( $vertex_label )
    "out": ( $vertex_label | $resource_vertex )

edge: The edge key SHALL indicate an edge from a resource vertex to another resource.

If the in key is present within it, this SHALL be an edge from the resource vertex referred to by its vertex label to the resource vertex captured by the out key. If in is omitted, the resource vertex containing the edge key is assumed to be the in vertex.

The out key SHALL either refer to the destination vertex with a vertex label or be a list conforming to the resource vertex specification. For the latter, each resource vertex that appears in this list is assumed to have an edge of the same type from the in vertex.

If attr is present, it will describe the subsystem to which this edge belongs and the relationship between the two connecting resource vertices. If omitted, it will inherit the default edge attributes.

With something like this, R in a typical case can look like:

dflt_edge_attr: { subsystem: containment, relationship: contains }
resource:
  - type: rack
    count: 1
    id: 0
    edge:
      out:
        - type: node
          name: node7
          id: 7
          count: 1
          edge:
            out:
              - type: socket
                count: 1
                edge:
                  out:
                    - type: core
                      name: core0
                      id: 0
                    - type: memory
                      name: memory0
                      count: 4
                      unit: GB
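
To check that the proposed edge key is easy to consume, here is a minimal Python sketch that walks the structure above (written as a Python dict to avoid a YAML dependency) and yields one (source, target, attributes) triple per out-edge, falling back to the default edge attributes when attr is omitted. `walk_edges` is a hypothetical helper; the optional in key and vertex-label resolution are left out of this sketch.

```python
DEFAULT_EDGE_ATTR = {"subsystem": "containment", "relationship": "contains"}

def walk_edges(vertex, default_attr):
    """Yield (source, target, attr) for every out-edge reachable from vertex."""
    edge = vertex.get("edge")
    if edge is None:
        return
    attr = edge.get("attr", default_attr)  # inherit defaults when omitted
    out = edge.get("out", [])
    if isinstance(out, (str, dict)):       # a single label or a single vertex
        out = [out]
    for target in out:
        yield vertex, target, attr
        if isinstance(target, dict):       # labels (strings) are not followed
            yield from walk_edges(target, default_attr)

R = {"type": "rack", "count": 1, "id": 0, "edge": {"out": [
    {"type": "node", "name": "node7", "id": 7, "count": 1, "edge": {"out": [
        {"type": "socket", "count": 1, "edge": {"out": [
            {"type": "core", "name": "core0", "id": 0},
            {"type": "memory", "name": "memory0", "count": 4, "unit": "GB"},
        ]}},
    ]}},
]}}

pairs = [(s["type"], t["type"]) for s, t, _ in walk_edges(R, DEFAULT_EDGE_ATTR)]
print(pairs)
```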

dongahn commented 5 years ago

For the latter, each resource vertex that appears in this list is assumed to have an edge of the same type from the in vertex.

BTW, if a resource vertex has out edges of different types to other resources, we will have to end up emitting the edge key multiple times from the same in vertex. Are duplicate keys allowed in both YAML and JSON?

grondo commented 5 years ago

Are duplicate keys allowed in both YAML and JSON?

I believe keys in both YAML and JSON have to be unique.
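
As a quick check of what a parser does in practice, here is a Python sketch using the standard `json` module. The JSON specification (RFC 8259) says object names SHOULD be unique, and YAML requires mapping keys to be unique; Python's `json.loads` does not reject a repeated key but silently keeps the last value, so a repeated edge key would lose data. The `reject_duplicates` hook below is a hypothetical helper that turns that into an error.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    keys = [k for k, _ in pairs]
    dupes = {k for k in keys if keys.count(k) > 1}
    if dupes:
        raise ValueError(f"duplicate keys: {sorted(dupes)}")
    return dict(pairs)

doc = '{"type": "socket", "edge": {"out": "core0"}, "edge": {"out": "foo0"}}'

# Default behavior: the second "edge" silently replaces the first.
lenient = json.loads(doc)

# Strict behavior: the duplicate is reported instead of dropped.
try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    strict_ok = True
except ValueError:
    strict_ok = False

print(lenient["edge"], strict_ok)
```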

we will have to end up emitting the edge key multiple times from the same in vertex.

I'm not sure I understand. If in: and out: are valid keys for edge: (seems like it could better be called edges:), and both are lists, then you can specify any number of in or out edges for a given resource vertex?

We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.

grondo commented 5 years ago

"dflt_edge_attr": { "subsystem": string, "relationship": string }

I suggest we don't add top-level keys in R like this. For readability, extensibility (and a bit of sanity), I'd suggest something namespaced, like

defaults:
  edge:
     attrs: { "subsystem": "containment", "relationship": "contains" }

dongahn commented 5 years ago

I'm not sure I understand. If in: and out: are valid keys for edge: (seems like it could better be called edges:), and both are lists, then you can specify any number of in or out edges for a given resource vertex?

Great idea. Let me play with it. Love the idea of edges plural.

We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.

Yes. We almost have to allow this. It's just that I also wanted the jobspec-like structure of the resource section to be a valid R format.

dongahn commented 5 years ago

I suggest we don't add top-level keys in R like this. For readability, extensibility (and a bit of sanity), I'd suggest something namespaced, like

Yup! I was thinking along the same line.

dongahn commented 5 years ago

Great idea. Let me play with it. Love the idea of edges plural.

defaults:
    edge:
        attrs: { subsystem: containment, out: contains, in: in }

resource:
  - type: rack
    count: 1
    id: 0
    edges:
      - out:
        - type: node
          name: node7
          id: 7
          count: 1
          edges:
            - out:
              - type: socket
                count: 1
                edges:
                  - out:
                      - type: core
                        name: core0
                        count: 1
                        id: 0
                      - type: memory
                        name: memory0
                        count: 4
                        unit: GB
                  # how to annotate different out-edge type
                  - out:
                      type: foo
                      name: foo1
                      count: 1

I like this direction. But I am not clear on the best way to annotate an out edge when it has a different attribute than the default. Now I remember I used the singular edge key because of this. @grondo: any idea?

dongahn commented 5 years ago

edges:
    - out:
        ? attrs: {}
        vtx: $resource_vertex

We can also do it this way at the expense of being verbose...of course.

trws commented 5 years ago

with implies a multiplicative edge, and I found that in a fully concretized resource set this semantics can be confusing. What happens if an intermediate vertex's count is multiple? This can lead to an incorrect interpretation of the size of a target vertex. I think introducing the edge key whose semantic simply is associative would be much cleaner.

Note that, at least early on in here, that was meant to be dealt with by range expansion on names and/or IDs such that you could have something like:

type: node
name: n[1-50]
count: 50
with:
  - type: core
...

I'm not sure we still want to do that, but it's an option. Otherwise, for machine generated R, it could just be explicitly laid out with counts of only 1.
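
For reference, range expansion on names is cheap to implement. The Python sketch below expands a single bracketed range such as `n[1-50]`; the bracket syntax is taken from the example above, while the function name and the restriction to one `lo-hi` range per name are assumptions of this sketch.

```python
import re

def expand_names(pattern):
    """Expand 'n[1-50]' into ['n1', ..., 'n50']; other names pass through."""
    m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if m is None:
        return [pattern]
    prefix, lo, hi, suffix = m.groups()
    return [f"{prefix}{i}{suffix}" for i in range(int(lo), int(hi) + 1)]

names = expand_names("n[1-50]")
print(len(names), names[0], names[-1])
```

A reader could compare `len(names)` with the vertex's `count` as a consistency check before expanding the vertex into distinct nodes.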

grondo commented 5 years ago

If I'm following correctly we have an edges: key which specifies a list of edges that are currently either in: or out:, and each of these keys are in turn lists of vertexes connected by these types of edges.

However, there may be multiple types of in or out edges.

Maybe new edge types should be specified in an outer dictionary, along with each type's attributes? This is probably better and less verbose than using defaults:

properties:
   edge:
      with: { "attrs": { "subsystem": "containment", "out": "contains", "in": "in" }}
      foo: { "attrs": ... } # set attributes for edge type "foo"

Or something similar.

Would this work? The drawback is that properties can't be overridden within the spec, like if we had separate attr and vertices keys...

trws commented 5 years ago

In another thread we had other ways to specify edges, this is a version I had for a relatively dense way to hand-write edges for example:

type: node
<power:
  - type: pdu
>with: core
<with: rack

Where the prefix character on the key represented the edge direction, either in or out. That's rather... opaque I would say, but it's OK as shorthand. If the goal is to do multiple edge types under a single edges key, what we were talking about in 2015 has an example here with description, copied below, called links back then:

    type: Core
    count: 1
    tasks: 1 # defaults to one, meaning one of these rspecs per task, to get one total, use *
    sharing: exclusive
    contains: []
    links:
      - type: uses
        direction: out
        target: 17

We discussed this concept and expressing it at length in that issue.

grondo commented 5 years ago

Thanks for commenting @trws! I freely admit I've completely lost context on what we've discussed before.

trws commented 5 years ago

Happy to, sorry it took so long actually, OpenMP F2F meeting this week. I like the idea of doing "edges: type : ..." by the way, that's a nice way to handle multiple types without all the extra verbosity of having a "type: ..." key on every one.

dongahn commented 5 years ago

We shouldn't bend over backwards to make the resources section from jobspec fit our R though. If it would be better to first emit vertices, then edges in a separate object, perhaps we should allow for that.

Yes. We almost have to allow this. It's just that I also wanted the jobspec-like structure of the resource section to be a valid R format.

Throwing one crazy idea out there: it seems a generic graph format "standard" for JSON is also emerging (like graphml being a small subset of xml to specify a graph), and one possibility is to latch onto that format instead of reinventing our own... http://jsongraphformat.info

dongahn commented 5 years ago

Also https://github.com/jsongraph/json-graph-specification

trws commented 5 years ago

I actually used one of those as the serialization format for the test implementation of the original python version of jobspec. As long as the graph format supports directed multigraphs it would be perfectly reasonable to use it. Asking users to write it is another issue, but using it for passing graphs around would be fine, and could even store “canonicalized” jobspec for that matter.

On 29 Jan 2019, at 13:38, Dong H. Ahn wrote:

Also https://github.com/jsongraph/json-graph-specification


dongahn commented 5 years ago

I actually used one of those as the serialization format for the test implementation of the original python version of jobspec. As long as the graph format supports directed multigraphs it would be perfectly reasonable to use it.

Oh. Good. I am looking at this, and it seems to have all the constructs I need. I will play with it some more then.

Asking users to write it is another issue

Right. This isn’t so human friendly. The current resource section of jobspec would be far better for that purpose. I don’t know if our requirements include R to be generated directly by human users.

but using it for passing graphs around would be fine, and could even store “canonicalized” jobspec for that matter.

Yes, in the future, when the jobspec supports a full graph, this could be useful as well.

dongahn commented 5 years ago

OK. It looks like JSON Graph Format (JGF) gives me everything that I need to emit the resource graph. See below an example in JGF to encode the graph in https://github.com/flux-framework/rfc/issues/109#issuecomment-458651472

Pros:

If JGF becomes a standard, we should have plenty of tools we can use, e.g., validators, readers/writers, visualizers, and editors for debugging and human interactivity (things like jsongraph.py already exist);
Similarly, building on a standard may allow us to leverage future techniques such as compression which the community may work on (or we can work on and contribute)

Cons:

A bit verbose (I will look at the spec to see if I can create default attributes so that we can optimize.)
Lose the human-friendly structure that jobspec's resource section has

{  
  "graph":{  
    "nodes":[  
      {  
        "id":"0",
        "metadata":{  
          "type":"rack",
          "name":"rack0",
          "id":0
        }
      },
      {  
        "id":"1",
        "metadata":{  
          "type":"node",
          "name":"node7",
          "id":7
        }
      },
      {  
        "id":"2",
        "metadata":{  
          "type":"socket",
          "name":"socket0",
          "id":0
        }
      },
      {  
        "id":"3",
        "metadata":{  
          "type":"core",
          "name":"core0",
          "id":0
        }
      },
      {  
        "id":"4",
        "metadata":{  
          "type":"memory",
          "name":"memory0",
          "count":2,
          "unit":1073741824,
          "id":0
        }
      },
      {  
        "id":"5",
        "metadata":{  
          "type":"foo",
          "name":"foo0",
          "id":0
        }
      }
    ],
    "edges":[  
      {  
        "source":"0",
        "target":"1",
        "metadata":{  
          "subsystem":"containment",
          "relationship":"contains"
        }
      },
      {  
        "source":"1",
        "target":"2",
        "metadata":{  
          "subsystem":"containment",
          "relationship":"contains"
        }
      },
      {  
        "source":"2",
        "target":"3",
        "metadata":{  
          "subsystem":"containment",
          "relationship":"contains"
        }
      },
      {  
        "source":"2",
        "target":"4",
        "metadata":{  
          "subsystem":"containment",
          "relationship":"contains"
        }
      },
      {  
        "source":"2",
        "target":"5",
        "metadata":{  
          "subsystem":"foo",
          "relationship":"bars"
        }
      }
    ]
  }
}
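As a rough illustration of how a consumer such as a containment plugin might query this encoding, the sketch below follows only edges whose subsystem is "containment" and collects the vertices under a given root, optionally filtered by type. The helper names (`edge`, `resources_under`) are hypothetical and not part of any Flux API, and reading count times unit as a size in bytes is an assumption about the memory vertex.

```python
from collections import defaultdict


def edge(source, target, subsystem="containment", relationship="contains"):
    """Build one JGF edge with the metadata keys used in this thread."""
    return {"source": source, "target": target,
            "metadata": {"subsystem": subsystem, "relationship": relationship}}


# Condensed copy of the example graph above.
JGF = {"graph": {
    "nodes": [
        {"id": "0", "metadata": {"type": "rack", "name": "rack0", "id": 0}},
        {"id": "1", "metadata": {"type": "node", "name": "node7", "id": 7}},
        {"id": "2", "metadata": {"type": "socket", "name": "socket0", "id": 0}},
        {"id": "3", "metadata": {"type": "core", "name": "core0", "id": 0}},
        {"id": "4", "metadata": {"type": "memory", "name": "memory0",
                                 "count": 2, "unit": 1073741824, "id": 0}},
        {"id": "5", "metadata": {"type": "foo", "name": "foo0", "id": 0}},
    ],
    "edges": [edge("0", "1"), edge("1", "2"), edge("2", "3"), edge("2", "4"),
              edge("2", "5", subsystem="foo", relationship="bars")],
}}


def resources_under(jgf, root_id, rtype=None, subsystem="containment"):
    """Return metadata of vertices reachable from root_id via one subsystem.

    Hypothetical helper: edges from other subsystems are ignored, and the
    root vertex itself is not included in the result.
    """
    graph = jgf["graph"]
    meta = {n["id"]: n["metadata"] for n in graph["nodes"]}
    children = defaultdict(list)
    for e in graph["edges"]:
        if e["metadata"]["subsystem"] == subsystem:
            children[e["source"]].append(e["target"])

    found, stack, seen = [], [root_id], {root_id}
    while stack:
        vid = stack.pop()
        for child in children[vid]:
            if child not in seen:  # guard against revisiting shared vertices
                seen.add(child)
                stack.append(child)
                if rtype is None or meta[child]["type"] == rtype:
                    found.append(meta[child])
    return found


# Assumption: count * unit is the size of the memory vertex in bytes.
mem_bytes = sum(m["count"] * m["unit"]
                for m in resources_under(JGF, "1", rtype="memory"))
```

Because the "foo" edge belongs to a different subsystem, the foo0 vertex does not show up in the containment query, even though it hangs off the same socket.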

dongahn commented 5 years ago

A bit verbose (I will look at the spec to see if I can create default attributes so that we can optimize.)

From the current spec, it wasn't clear whether JGF supports default node/edge properties. But since this is JSON, we can always add those properties as extra data (building on @grondo's idea at https://github.com/flux-framework/rfc/issues/109#issuecomment-458664871).

{
    "properties": {
        "edges": [
            {
                "id": "default",
                "subsystem": "containment",
                "relationship": "contains"
            },
            {
                "id": "foo",
                "subsystem": "foo",
                "relationship": "bars"
            }
        ]
    },
    "graph": {
        "nodes": [
            {
                "id": "0",
                "metadata": {
                    "type": "rack",
                    "name": "rack0",
                    "id": 0
                }
            },
            {
                "id": "1",
                "metadata": {
                    "type": "node",
                    "name": "node7",
                    "id": 7
                }
            },
            {
                "id": "2",
                "metadata": {
                    "type": "socket",
                    "name": "socket0",
                    "id": 0
                }
            },
            {
                "id": "3",
                "metadata": {
                    "type": "core",
                    "name": "core0",
                    "id": 0
                }
            },
            {
                "id": "4",
                "metadata": {
                    "type": "memory",
                    "name": "memory0",
                    "count": 2,
                    "unit": 1073741824,
                    "id": 0
                }
            },
            {
                "id": "5",
                "metadata": {
                    "type": "foo",
                    "name": "foo0",
                    "id": 0
                }
            }
        ],
        "edges": [
            {
                "source": "0",
                "target": "1"
            },
            {
                "source": "1",
                "target": "2"
            },
            {
                "source": "2",
                "target": "3"
            },
            {
                "source": "2",
                "target": "4"
            },
            {
                "source": "2",
                "target": "5",
                "metadata": {
                    "property": "foo"
                }
            }
        ]
    }
}
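To make the intended semantics of the "properties" section concrete, here is a minimal sketch of a reader that expands it back into plain JGF. It assumes one particular reading of the proposal: an edge with no metadata takes the property set whose id is "default", and an edge with a "property" key takes the set with that id. The function name `expand_edge_defaults` is hypothetical.

```python
import copy

# Trimmed version of the document above: one default edge, one "foo" edge.
DOC = {
    "properties": {"edges": [
        {"id": "default", "subsystem": "containment", "relationship": "contains"},
        {"id": "foo", "subsystem": "foo", "relationship": "bars"},
    ]},
    "graph": {
        "nodes": [
            {"id": "2", "metadata": {"type": "socket", "name": "socket0", "id": 0}},
            {"id": "3", "metadata": {"type": "core", "name": "core0", "id": 0}},
            {"id": "5", "metadata": {"type": "foo", "name": "foo0", "id": 0}},
        ],
        "edges": [
            {"source": "2", "target": "3"},
            {"source": "2", "target": "5", "metadata": {"property": "foo"}},
        ],
    },
}


def expand_edge_defaults(doc):
    """Return plain JGF with every edge's metadata written out in full.

    Assumed semantics (not confirmed by the JGF spec): a missing
    "property" key means the property set with id "default".
    """
    props = {p["id"]: {k: v for k, v in p.items() if k != "id"}
             for p in doc["properties"]["edges"]}
    graph = copy.deepcopy(doc["graph"])  # leave the input document untouched
    for e in graph["edges"]:
        name = e.get("metadata", {}).get("property", "default")
        e["metadata"] = dict(props[name])
    return {"graph": graph}


expanded = expand_edge_defaults(DOC)
```

Keeping the expansion in a single function means tools that only understand standard JGF could still be fed the expanded output.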

@grondo and @trws: thoughts?

dongahn commented 5 years ago

A bit verbose

Also, I found that JGF is much more condensed and legible than GraphML (https://github.com/flux-framework/rfc/issues/109#issuecomment-335958389), though!

SteVwonder commented 5 years ago

It looks like JSON Graph Format (JGF) gives me everything that I need to emit the resource graph.

If JGF can become a standard, we should have plenty of tools we can use -- e.g., validators, readers/writers, visualizers (things like jsongraph.py already exist), and editors for debugging and human interactivity.

:+1: From their website, it looks like they are also using json-schema for validation. So +1 for no added dependencies to read and another +1 for no added dependencies to validate.

Right. This isn’t so human friendly. The current resource section of jobspec would be far better for that purpose. I don’t know if our requirements include R to be generated directly by human users.

I agree with you that I don't expect users to have to write R, but they will most likely have to read it. I imagine myself frequently dumping R from the KVS to see what resources my job ran on. That being said, the examples you posted are quite legible IMO.

A bit verbose

Just a thought, but this format probably compresses really well. Lots of repeated tags and patterns. It wouldn't help human-readability, but if we pass the compressed version around in the messages, it can potentially help performance. We could also store it compressed in the KVS, but if we do that, we'll want to make it simple for users to decompress and view R from the KVS.
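A quick way to check the compression intuition is to serialize a synthetic JGF document and run it through zlib. The graph below is made up for illustration (one socket containing 63 cores), and zlib is only one possible choice of compressor; nothing in this thread settles on one.

```python
import json
import zlib

# Synthetic graph: vertex "0" is a socket, vertices "1".."63" are its cores.
nodes = [{"id": "0", "metadata": {"type": "socket", "name": "socket0", "id": 0}}]
nodes += [{"id": str(i),
           "metadata": {"type": "core", "name": f"core{i - 1}", "id": i - 1}}
          for i in range(1, 64)]
edges = [{"source": "0", "target": str(i),
          "metadata": {"subsystem": "containment", "relationship": "contains"}}
         for i in range(1, 64)]

raw = json.dumps({"graph": {"nodes": nodes, "edges": edges}}).encode("utf-8")
packed = zlib.compress(raw)

# Decompressing recovers the original document byte for byte.
restored = json.loads(zlib.decompress(packed))

print(f"raw={len(raw)} bytes, compressed={len(packed)} bytes")
```

A small viewer command that decompresses before printing would keep R readable for users dumping it from the KVS.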

Also, I found that JGF is much more condensed and legible than GraphML (#109 (comment)), though!

:+1: :+1: I agree. Much nicer to look at than GraphML.

grondo commented 5 years ago

Cons: A bit verbose (I will look at the spec to see if I can create default attributes so that we can optimize.) Loses the human-friendly structure that jobspec's resource section has

Can we propose that the base R type still contains a hardware topology ("containment" hierarchy) as resource: or some other key, with a simple, more human readable hierarchy-only representation, while flux-sched and other advanced schedulers can extend the base R with a "graph:" (and other) sections which can be considered opaque to flux-core components?

Really, what is required from flux-core is the basic containment hierarchy, and an ability to map R to ranks and construct an R_local (exec system), and map task slots to local resources (job shell). We could also offer simple tools that parse and display the R for a job (e.g. in queue listing we might just need to pull out a host list, or in a more detailed listing a user-friendly representation of R), or users could dump just the resource: section of R directly.

Not that I am fully opposed to specifying the format of R as JGF, but at this point it seems like overkill for flux-core components. But I'd hate to add extra complexity elsewhere for only a modicum of simplicity in flux-core.

dongahn commented 5 years ago

Can we propose that the base R type still contains a hardware topology ("containment" hierarchy) as resource: or some other key, with a simple, more human readable hierarchy-only representation, while flux-sched and other advanced schedulers can extend the base R with a "graph:" (and other) sections which can be considered opaque to flux-core components?

I think a two-section approach has several advantages. For example, this way we don't have to include rank and slot info in the "graph" section needed by the nested schedulers.

One disadvantage, though, is that R will be on the order of twice as big with the two-section approach. That may be okay... Generally, the two-section approach would be a bit faster for consumers in exchange for more space.

One alternative approach would be to develop a converter layer that converts the graph into the containment hierarchy that you require for the execution service.
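A converter of this kind could be fairly small. The sketch below turns the JGF encoding into a nested, hierarchy-only tree by following containment edges from a chosen root. The function name `to_containment_tree` and the "children" key are hypothetical placeholders, since the format of the resource section is still undecided here, and the recursion assumes the containment subsystem is an acyclic tree.

```python
def to_containment_tree(jgf, root_id, subsystem="containment"):
    """Convert a JGF graph into a nested dict of containment relationships.

    Hypothetical converter: only edges of the given subsystem are followed,
    and the containment subsystem is assumed to contain no cycles.
    """
    graph = jgf["graph"]
    meta = {n["id"]: n["metadata"] for n in graph["nodes"]}
    kids = {}
    for e in graph["edges"]:
        if e.get("metadata", {}).get("subsystem") == subsystem:
            kids.setdefault(e["source"], []).append(e["target"])

    def build(vid):
        node = dict(meta[vid])  # copy so the input graph is not modified
        children = [build(c) for c in kids.get(vid, [])]
        if children:
            node["children"] = children
        return node

    return build(root_id)


CONTAINS = {"subsystem": "containment", "relationship": "contains"}
JGF = {"graph": {
    "nodes": [
        {"id": "1", "metadata": {"type": "node", "name": "node7", "id": 7}},
        {"id": "2", "metadata": {"type": "socket", "name": "socket0", "id": 0}},
        {"id": "3", "metadata": {"type": "core", "name": "core0", "id": 0}},
        {"id": "5", "metadata": {"type": "foo", "name": "foo0", "id": 0}},
    ],
    "edges": [
        {"source": "1", "target": "2", "metadata": CONTAINS},
        {"source": "2", "target": "3", "metadata": CONTAINS},
        {"source": "2", "target": "5",
         "metadata": {"subsystem": "foo", "relationship": "bars"}},
    ],
}}

tree = to_containment_tree(JGF, "1")
```

With a converter like this, the scheduler would emit only the graph and the execution side would derive its own view, trading some consumer-side work for a smaller R.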

With either approach, a question remains: what should the exact format of the resource section be? As we discussed above, I would at least avoid the use of the "with" key as a first cut.

dongahn commented 5 years ago

How about:

https://github.com/flux-framework/rfc/issues/109#issuecomment-458651472 or similar under the resource key and JGF under the sched or graph key?

For our current sprint, resource can have R_lite++ of course?

dongahn commented 5 years ago

BTW, does the execution system need info on the higher level resources? Like rack or cluster? Or just node and down?

dongahn commented 5 years ago

@grondo and @SteVwonder: I can start drafting an RFC on a two-section proposal as a way to push this dialogue forward, if you like.

grondo commented 5 years ago

That would be a great start! Thanks

trws commented 5 years ago

I realize I'm coming to this a bit late, but I'd put in my 2c for keeping it one piece. The hardware topology should always be walkable by just limiting the graph representation to that kind of edge, and if it's in two formats in two parts at least some of the components will have to work with both. The job shell has to take R emitted from sched for example, so either sched needs to work with both formats or there has to be one that works for both. Perhaps it would be good to come up with an API or similar for what the core side wants to talk to, that could understand whatever the format is underneath and provide the appropriate information rather than expecting it to walk the resource spec directly?
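One way to picture the API idea is a small read-only interface that core components call, with a backend per underlying format. Everything in this sketch is hypothetical: `ResourceView`, `JGFView`, `TreeView`, the `vertex_count` method, and the "children" key are illustrative names, not existing flux-core interfaces.

```python
from abc import ABC, abstractmethod


class ResourceView(ABC):
    """Hypothetical read-only interface over R, independent of its format."""

    @abstractmethod
    def vertex_count(self, rtype: str) -> int:
        """Return the number of resource vertices of the given type."""


class JGFView(ResourceView):
    """Backend that reads a JGF-encoded graph."""

    def __init__(self, jgf):
        self._nodes = jgf["graph"]["nodes"]

    def vertex_count(self, rtype):
        return sum(1 for n in self._nodes if n["metadata"]["type"] == rtype)


class TreeView(ResourceView):
    """Backend that reads a nested tree using an assumed "children" key."""

    def __init__(self, tree):
        self._tree = tree

    def vertex_count(self, rtype):
        def walk(node):
            own = 1 if node["type"] == rtype else 0
            return own + sum(walk(c) for c in node.get("children", []))
        return walk(self._tree)


CONTAINS = {"subsystem": "containment", "relationship": "contains"}
as_graph = {"graph": {
    "nodes": [
        {"id": "0", "metadata": {"type": "socket", "name": "socket0", "id": 0}},
        {"id": "1", "metadata": {"type": "core", "name": "core0", "id": 0}},
        {"id": "2", "metadata": {"type": "core", "name": "core1", "id": 1}},
    ],
    "edges": [
        {"source": "0", "target": "1", "metadata": CONTAINS},
        {"source": "0", "target": "2", "metadata": CONTAINS},
    ],
}}
as_tree = {"type": "socket", "name": "socket0", "children": [
    {"type": "core", "name": "core0"},
    {"type": "core", "name": "core1"},
]}

# The caller sees the same answer regardless of the underlying format.
views = [JGFView(as_graph), TreeView(as_tree)]
core_counts = [v.vertex_count("core") for v in views]
```

The caller never touches the serialized form, so the format could change later without changes to the components that consume it.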

dongahn commented 5 years ago

@trws:

Thank you for your thoughts.

I think what @grondo wants is to put what's required by the execution service in one section in an easy-to-use format, and the full information pertaining to the scheduler in the second section. This isn't too difficult for the scheduler to do. I hope that we don't run into a situation where any component needs to read both sections. Clearly this isn't optimal in terms of storage and R producer performance. But it has a consumer performance advantage (the execution system only needs to read one section, while the nested scheduler instance doesn't need to read data like "rank" and "slots") and lowers the complexity of the execution system software. The API approach (or the converter approach I suggested above) would be another excellent way to overcome this issue. But what @grondo seems to be concerned about is the complexity of designing an API at this point, as it would have to be graph code.

trws commented 5 years ago

Fair enough. I'm all for using the format that makes sense, just pointing out that just because it's stored as a graph (or as a tree) doesn't mean we have to access it that way in terms of the API. The resource spec format is a graph jammed into a tree format, so is JGF, so there are other options if for some reason it turns out to be complicated to get a simple serialization format.

dongahn commented 5 years ago

Completely agreed! Good thoughts @trws.

dongahn commented 5 years ago

Thank you for the good discussions. We will probably want to refer back to this when we evolve R later.

But for now PR https://github.com/flux-framework/rfc/pull/155 resolved the ticket.

The first obvious idea is to use the resource section format of the jobspec: