flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

Design/prototype new run interface #2213

Open trws opened 5 years ago

trws commented 5 years ago

This is a design/discussion issue for the command line arguments and syntax for a jobspec-oriented run/submit interface.

The main idea is that there is a "slot shape", and a target entity for task scheduling. I'm not 100% sold on my own terminology here so please feel free to propose alternatives. Essentially, you get whether the task is per-slot or per-resource and which resource from the --target parameter, which defaults to slot. The number of tasks can either be per-target, or a total count, and the shape is specified with a restricted version of the original short-form jobspec I proposed. Here's a sketch of the interface:

flux run

use-cases, drawn from rfc14:

  1. Request 4 nodes: flux run --nslots 4
  2. Request between 3 and 30 nodes: flux run --nslots 3:30
  3. Request 4 tasks (sic; this was originally nodes, but that would be the same as the next case) with at least 2 sockets each and 4 cores per socket (we are not planning to support sockets yet): flux run --nslots 4 --shape socket[2]>core[4]
  4. Request an exclusive allocation of 4 nodes that have at least two sockets and 4 cores per socket: flux run --nslots 4 --shape node>socket[2]>core[4]

Skipping the complex examples as we don't plan to support them yet, and for now the recommended mechanism would be writing the jobspec.

use-case set 2:

  1. Run hostname 20 times on 4 nodes, 5 per node
    1. flux run --nslots 4 --total-tasks 20 hostname
    2. flux run --nslots 4 --tasks-per-slot 5 hostname
    3. flux run --slot-shape node[4] --tasks-per-resource node:5 hostname
  2. Run 5 copies of hostname across 4 nodes, default distribution: flux run --nslots 4 --total-tasks 5 hostname
  3. Run 10 copies of myapp, require 2 cores per copy, for a total of 20 cores: flux run --nslots 10 --shape core[2] myapp
  4. Multiple binaries are not necessarily on tap yet, but I'm thinking of allowing several of these specifications on the same command line with a separator, which would probably get us to the same place.
  5. Run 10 copies of app across 10 cores with at least 2GB per core: flux run --shape (core,memory[2g]) app (possibly amounts we may need to revisit)
  6. Run 10 copies of app across 2 nodes with at least 4GB per node: flux run --shape node>memory[4g] --total-tasks 10 app
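To make the arithmetic in case 1 of this set concrete, here is a minimal Python sketch. It is not part of flux: the function name `tasks_per_slot` is hypothetical, and the even-spread rule is only an assumption about what a "default distribution" could look like.

```python
def tasks_per_slot(nslots: int, total_tasks: int) -> list[int]:
    """Spread total_tasks over nslots as evenly as possible.

    Hypothetical helper; the default distribution Flux actually uses
    may differ from this even-spread assumption.
    """
    if nslots <= 0 or total_tasks < 0:
        raise ValueError("nslots must be positive and total_tasks non-negative")
    base, extra = divmod(total_tasks, nslots)
    # The first `extra` slots each take one additional task.
    return [base + (1 if i < extra else 0) for i in range(nslots)]


# --nslots 4 --total-tasks 20 describes the same layout as --tasks-per-slot 5.
print(tasks_per_slot(4, 20))
# --nslots 4 --total-tasks 5 cannot be even, so one slot gets two tasks.
print(tasks_per_slot(4, 5))
```

When the total divides evenly by the slot count, the total-count and per-slot forms are interchangeable; the two only differ once there is a remainder to place.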

One possible issue here is that several of our use-cases require the slot to be outside the node for them to be easily expressible. Opening another issue for discussion of jobspec-V1 and ordering shortly.

SteVwonder commented 5 years ago

I really like this interface. I think it is very concise yet powerful and flexible.

--time walltime, TODO need a format here, prefer a standard

Flux Standard Duration? Unfortunately not super user friendly if you want to specify a fractional amount like 4 hours and 30 minutes (i.e., 270m in FSD).

--tasks-per-target: number of tasks to run per target, either slot or resource, default 1

When I showed this to a fellow HPDC attendee (@dchapp), the target name was initially very confusing to him. Once I got him thinking purely in terms of slots (and basically ignoring the target idea to start), the interface was very natural to him. I think the 90% use-case will be with slots, and only the more advanced users will end up using the --target option (I may be wrong on that though). If that is the case, I think it would be helpful to have a --tasks-per-slot option, which is essentially an alias for --tasks-per-target and is (potentially) mutually exclusive with --target. In any user documentation, the --target option should probably be presented towards the end, if not in its own section, to avoid confusing users with simpler resource specification needs.

I am also leaning towards changing --shape to --slot-shape so that it is clear when reading the final command line: flux run --shape Core[2] --nslots 10 myapp vs. flux run --slot-shape Core[2] --nslots 10 myapp. In the former, it is ambiguous whether the shape means the shape of the entire allocation or the shape of the slot.

One other point brought up by @dchapp was, in the case of --shape (Core,Memory) what does Memory default to? If it defaults to 1 byte, that's not overly useful and could be harmful if the job is actually contained to that exact amount. I wonder if we make an exception for that specific resource (i.e., the default is 1g).

garlick commented 5 years ago

Unfortunately not super user friendly if you want to specify a fractional amount like 4 hours and 30 minutes (i.e., 270m in FSD).

4.5h would be valid FSD as well since the number is defined as floating point.
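As an illustration of why 270m and 4.5h are the same request, here is a small Python sketch of a duration parser. It is not the libutil implementation: the suffix set (s, m, h, d), the bare-number-means-seconds rule, and the name `parse_duration` are assumptions for the example, and it only accepts plain decimal numbers.

```python
import re

# Suffix multipliers in seconds. This set is an assumption for the sketch;
# the Flux Standard Duration RFC is the authoritative source.
_UNITS = {"s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}
_FSD_RE = re.compile(r"^(?P<num>[0-9]*\.?[0-9]+|[0-9]+\.)(?P<unit>[smhd]?)$")


def parse_duration(text: str) -> float:
    """Return the duration in seconds for strings such as '270m' or '4.5h'."""
    match = _FSD_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    # A bare number is treated as seconds (assumed default unit).
    return float(match["num"]) * _UNITS[match["unit"] or "s"]


print(parse_duration("270m") == parse_duration("4.5h"))
```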

Would it make sense to add a new subcommand to flux-jobspec that implements the proposed command line as the first prototype step? That makes it easy to play with...

trws commented 5 years ago

Good feedback, thanks! I'll update it to slot-shape, it's certainly easier to tell what that means.

What would you think of getting rid of --target as it is, having --tasks-per-slot and --tasks-per-resource where the latter would take an argument of the form <resource>:<num> or similar? I keep circling around how to deal with that split, and I like the idea of having the two be mutually exclusive, so maybe make both the per-resource options one?
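A rough Python sketch of that split, using argparse's mutually exclusive groups. The option names follow this thread, but the parser itself, the program name `flux-run-sketch`, and the helper `resource_count` are hypothetical and are not the actual flux front end.

```python
import argparse


def resource_count(value: str) -> tuple[str, int]:
    """Parse '<resource>:<num>', e.g. 'node:5' -> ('node', 5)."""
    resource, sep, num = value.partition(":")
    try:
        count = int(num)
    except ValueError:
        count = 0
    if not sep or not resource or count < 1:
        raise argparse.ArgumentTypeError(
            f"expected <resource>:<num>, got {value!r}"
        )
    return resource, count


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="flux-run-sketch")
    parser.add_argument("--nslots", type=int, default=1)
    parser.add_argument("--slot-shape")
    # Only one of the two per-target task counts may be given.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--tasks-per-slot", type=int)
    group.add_argument("--tasks-per-resource", type=resource_count)
    parser.add_argument("command", nargs="+")
    return parser


args = build_parser().parse_args(
    ["--slot-shape", "node[4]", "--tasks-per-resource", "node:5", "hostname"]
)
print(args.tasks_per_resource, args.command)
```

Passing both --tasks-per-slot and --tasks-per-resource makes argparse exit with a usage error, which is the mutual exclusion being discussed.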

As to memory, I think we had a concept of a default unit at some point; when it comes to memory or storage, having a default unit of gigabyte seems pretty reasonable. Will have to think about that. The first version won't support memory as an option since V1 jobspec doesn't, but when we get there I would think we'd do something like that.

On flux-jobspec, sure I don't see why not. It would probably make some of the refactoring easier to build the second part in there anyway now that I think about it.

dongahn commented 5 years ago

Great start!

Yes I agree that dovetailing this with flux-jobspec will make later testing easier.

In terms of the resource type names, I prefer using the actual names used in other places, like the schedulers. Right now we are using lowercase, like socket instead of Socket.

Same comment on how to specify the time as @SteVwonder.

trws commented 5 years ago

Good point @dongahn, I had forgotten the implementation used all-lower, updated the description to match, and added the flux duration up there. Do we have an existing library for parsing the durations?

dongahn commented 5 years ago

Do we have an existing library for parsing the durations?

Good question. @SteVwonder or @garlick? I need to modify libjobspec as well.

SteVwonder commented 5 years ago

Do we have an existing library for parsing the durations?

That’s a good question. Turns out, we do. In libutil. https://github.com/flux-framework/flux-core/blob/4e01f7517c6ab3e3b5b56a67282a9205125a1498/src/common/libutil/fsd.h

What would you think of getting rid of --target as it is, having --tasks-per-slot and --tasks-per-resource where the latter would take an argument of the form <resource>:<num> or similar?

That sounds like a good plan to me.

trws commented 5 years ago

Cool. Will definitely use, although I'm not sure we want to continue allowing NaN and infinite to be valid durations...

SteVwonder commented 5 years ago

Agreed that NaN shouldn't be valid. Infinite though, that could be useful for more cloud-like persistent workloads. 🤷‍♂

garlick commented 5 years ago

Maybe the tiny FSD RFC should be updated with 1) NaN not allowed, and 2) decimal point optional.

Jim


dongahn commented 5 years ago

Maybe just state positive normal FP with decimal optional.

There are quiet NaNs, signaling NaNs, overflow, underflow, etc., as defined by IEEE 754. They are considered not normal, and we shouldn't allow any of these. If users want inf, they can just specify a very large value?

trws commented 5 years ago

I’ll propose an update and a patch, it should be pretty straightforward. My preference would be (!(zero||nan||inf)). I don’t see a strong reason to explicitly disallow subnormals, they really shouldn’t come up, but if they do I don’t know why we’d choke. Did you have an issue in mind Dong?
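Written out in Python, the (!(zero||nan||inf)) rule looks roughly like the sketch below. The function name `is_valid_duration` is made up for illustration. The check follows the rule exactly as quoted, so negative values are not rejected here; a real parser would presumably handle those separately.

```python
import math


def is_valid_duration(seconds: float) -> bool:
    """Apply the quoted rule: reject zero, NaN and infinity, nothing else."""
    return not (seconds == 0 or math.isnan(seconds) or math.isinf(seconds))


print(is_valid_duration(4.5))            # ordinary value
print(is_valid_duration(5e-324))         # subnormal, deliberately allowed
print(is_valid_duration(float("nan")))   # rejected
print(is_valid_duration(float("inf")))   # rejected
```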

dongahn commented 5 years ago

I have no issue. Wanted to reduce the wording. Fine with your proposal.

Dong

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had activity for 365 days. It will be closed if no further activity occurs within 14 days. Thank you for your contributions.

vsoch commented 1 year ago

@trws I think what might make sense is to define this spec as a file the user can provide to these commands? I might be able to just do:

$ flux run -f myjob.yaml

And then have the standard be that any attribute provided in yaml could also be mapped to the command line. I'd also argue that this file format should be accepted by anything that can submit jobs, and then maybe have an extra boolean or something for batch, etc.

Pinging @alecbcs he will be excited for this design discussion!

trws commented 1 year ago

We actually have that, that's what jobspec is and it's what we already generate from the mini commands, this is meant to be the command that either takes a file like that, or lets you define a basic structure on the fly in a way that uses the same conceptual model. There are a couple of RFCs and many discussions, but the most relevant ones are:

The trick is how to make it comfortable for a user to specify something that in the common case is a small slice of a tree. For example, given the shell escaping issues we've been having, I think this would go badly flux run --nslots 4 --shape socket[2]>core[4] even if it's well specified and relatively easy (to me) to read because of all the special characters in it. Either way, we could start with taking resource-spec from a file and work out the rest with that maybe, then work up a way to specify a short-form shape on the command line?
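To illustrate the escaping concern: in a POSIX shell an unquoted `>` is output redirection and `[...]` can be glob-expanded, so the shape has to be quoted. The sketch below uses Python's standard `shlex` module to show the quoted form; the `--nslots` and `--shape` options are the ones proposed in this thread, not existing flux flags.

```python
import shlex

shape = "socket[2]>core[4]"
argv = ["flux", "run", "--nslots", "4", "--shape", shape]

# shlex.join quotes only the arguments that contain shell-special characters.
command_line = shlex.join(argv)
print(command_line)

# Splitting the quoted line again recovers the original argument list.
print(shlex.split(command_line) == argv)
```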

vsoch commented 1 year ago

flux run --nslots 4 --shape socket[2]>core[4]

I don't think you can do that given the characters, and also having it specified on the command line wouldn't be reproducible. If we require a file, that encourages reproducibility (good or better behavior than not?).

Either way, we could start with taking resource-spec from a file and work out the rest with that maybe, then work up a way to specify a short-form shape on the command line?

I think that's what I was suggesting! And this might be an unpopular opinion, but is yaml allowed for this spec? I think we should follow suit with what a lot of cloud native technologies do, where everything is in YAML. I know it's not as good as json for large batch cases, but I think for a one-off (single) file to be read once it's OK. This also kind of relates to the workflow SI you were describing, right?

vsoch commented 1 year ago

The other thing that comes to mind is that, at least for the ones (Jobspecs) I've seen, they don't seem like it would be fun for a human to generate. So I'm not sure if we really should be using the same thing here. Even for the simple cases here: https://flux-framework.readthedocs.io/projects/flux-rfc/en/latest/spec_25.html I couldn't imagine wanting to write them, and I can't imagine more complex cases. I'm wondering if there is an easier way to have these generated with some automation (and then the user can customize) but the main file they submit doesn't need to be so complex.

trws commented 1 year ago

YAML is what we already use, and have since ~2015 😄

I agree that a file is better for reproducibility, but it's not good for doing exploratory work or one-off commands which are also quite common. I'm happy to start with files, possibly allowing resource spec or full jobspec, but I really think we want to have some way to specify resources the canonical way from the command line without having to produce a file first.

garlick commented 1 year ago

flux run --nslots 4 --shape socket[2]>core[4]

oar has a nice and very unslurmish way of expressing resource requirements on the command line FWIW.

trws commented 1 year ago

Oh yeah, those are canonical jobspec, that's after the fill-in-the-blanks part is done. We went back and forth on syntax for a long time, but conceptually where we landed was that something like: Node>Core[5] should be expandable to a full canonical jobspec with the appropriate ranges, resources, and levels, and an implicit slot enclosing the set. The prototype parser I built would also have accepted:

Node:Core[5]

and expanded to

resources:
  - type: Node
    count: 1
    with:
      - type: Core
        count: 5
# etc.
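A minimal Python sketch of that kind of expansion is shown here. It is not the prototype parser mentioned above: the function name `expand_shape` is hypothetical, and it handles only a single chain of `type[count]` levels separated by `>` or `:`, with no ranges, branches, or implicit slot insertion.

```python
import re

_LEVEL_RE = re.compile(r"^(?P<type>[A-Za-z_]+)(?:\[(?P<count>[0-9]+)\])?$")


def expand_shape(shape: str) -> list[dict]:
    """Expand 'Node>Core[5]' or 'Node:Core[5]' into a nested resource list."""
    inner: list[dict] = []
    # Build from the innermost level outward so each entry can wrap the last.
    for level in reversed(re.split(r"[>:]", shape)):
        match = _LEVEL_RE.match(level.strip())
        if match is None:
            raise ValueError(f"cannot parse level {level!r} in {shape!r}")
        # A level without an explicit [count] defaults to 1.
        entry: dict = {"type": match["type"], "count": int(match["count"] or 1)}
        if inner:
            entry["with"] = inner
        inner = [entry]
    return inner


print(expand_shape("Node:Core[5]"))
```

The returned structure mirrors the `resources:` list above, so dumping it with a YAML library would give the same shape of document.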

Some old discussions in #354 and the actual interface experiments over here.

Also quite right @garlick, OAR's looks a lot like a path, and we could do much the same as long as we figure out how to handle branches and certain differences in how we handle ranges and positioning. I think their equivalent of that would be oarsub -I -l nodes=1/core=5. To be honest, looking at it now I think they got it right from a character selection standpoint.

My thoughts back then were heavily influenced by how much I turned out to love using cypher as a language for querying graph structures, so I was going for something similar. In there a similar thing would have been match (n:node)->(c:core) WHERE count(c) > 5 return n.id.

vsoch commented 1 year ago

Yeah that is exactly what I mean! I think we would want the spec to be able to handle a "hard core" user writing it out in detail, but for the average user that absolutely does not want to do that, they should be able to just say like:

resources:
  shape: "Node:Core[5]"

And more likely they can write a file once alongside their workflow for others to use. And then that file can also have whatever easy parameters there are (that aren't directly related to the resource).

trws commented 1 year ago

Right, and for the people who absolutely love their in-file directives, we'll probably need a way to manage that too, maybe do a FLUX start/end sentinel with a HEREdoc or... not sure:

#FLUX: ---
<<EOF
resources:
  shape: 
EOF
#FLUX: ---

garlick commented 1 year ago

I guess it's somewhat of a gap but our front end commands (batch/submit/run) don't take jobspec as input AFAICT. One idea might be to add a --jobspec=FILE option that by default accepts a filename that is parsed as JSON or YAML jobspec in the native (currently v1) form, but optionally could accept a URI, where the scheme refers to a translator.

The translator would take some new form as input, and produce the native jobspec form as output. To add a new way to express resource requirements, you'd just provide a translator plugin in python. So for example, I could say flux submit --jobspec=frisbee:foo.yaml to invoke the frisbee translator.
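The dispatch could look something like the following Python sketch. Everything in it is hypothetical: the `TRANSLATORS` registry, the `register` decorator, and the placeholder `frisbee` translator are illustrations only, and the native loader reads JSON alone so the example stays within the standard library.

```python
import json
from typing import Callable

# Registry of translators, keyed by URI scheme (hypothetical design).
TRANSLATORS: dict[str, Callable[[str], dict]] = {}


def register(scheme: str):
    """Decorator that registers a translator function under a scheme name."""
    def decorator(func: Callable[[str], dict]) -> Callable[[str], dict]:
        TRANSLATORS[scheme] = func
        return func
    return decorator


def load_native(path: str) -> dict:
    """Read jobspec that is already in native form (JSON only in this sketch)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_jobspec(arg: str) -> dict:
    """Handle --jobspec=FILE or --jobspec=SCHEME:FILE."""
    scheme, sep, rest = arg.partition(":")
    if sep and scheme in TRANSLATORS:
        return TRANSLATORS[scheme](rest)
    # No known scheme: treat the whole argument as a native jobspec file.
    return load_native(arg)


@register("frisbee")
def frisbee(path: str) -> dict:
    # Placeholder: a real translator would read `path` and emit native jobspec.
    return {"translated_by": "frisbee", "source": path}


print(load_jobspec("frisbee:foo.yaml"))
```

Falling back to the native loader when the prefix is not a registered scheme keeps plain filenames working, including ones that happen to contain a colon.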

garlick commented 1 year ago

If the jobspec could be provided on the command line as well then I think the RFC 36 triple quote support would let that just work.

Edit: responding to @trws's point about in-file directives.

vsoch commented 1 year ago

I think instead of --jobspec I'd advocate for -f so it follows what many cloud native tools use - people are used to specifying "the same thing" via a file with -f. The reason is because a large set of users already have that flag in their mental map for "do this from this file" and it will make the adoption and learning easier.

trws commented 1 year ago

Yup, I think that's a good general direction. Though I think having a "default" translator would be a good thing, something that does basic expansion but nothing super fancy, and probably a "pedantic" or "canonical" or similar to say "this is canonical, if it's not, fail and tell me."

Oh and I'd missed the triple-quote syntax on that RFC, that would definitely work, though I was looking for a way we could allow it to not have to be inside a comment block to help support syntax highlighting it and having the block actually be valid copyable yaml.

We could have it be --jobspec and have a --file -f alias and have either work? I'm debating whether we should also support --resourcespec or similar to specify a shape in a file without specifying the actual job details like how many tasks, command, etc. or not. Not sure. I know I want a way to specify a shape on the command line, but is it worth taking just the shape from the file rather than the full jobspec or should we have it read jobspec but allow providing or overriding those parameters on the command line?

garlick commented 1 year ago

That idea may have been a little half-baked. I was equating jobspec with the resource request in my head, not taking into account the tasks and attributes sections. The current tooling does a good job of populating those sections, so hrm...

vsoch commented 1 year ago

For a lot of the tools I develop, they have some central config you can get/set values:

$ shpc config set parameter value
$ shpc config get parameter
> value

And then for any command, you can override a default "one off"

$ shpc -c parameter=onions install <container>

So maybe something like that would work here, except instead of a default config file, we have a default set (or not set) in a jobspec file.
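A sketch of how such one-off overrides might be layered over values read from a file, in Python. The dotted-key syntax, the function name `apply_overrides`, and the example keys are all assumptions for illustration and are not an existing flux or shpc interface.

```python
import copy


def apply_overrides(spec: dict, overrides: list[str]) -> dict:
    """Return a copy of spec with 'dotted.key=value' overrides applied."""
    result = copy.deepcopy(spec)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = result
        # Walk (or create) the nested dictionaries leading to the leaf key.
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Defaults as they might come from a file (hypothetical layout).
defaults = {"resources": {"shape": "Node:Core[5]"}, "tasks": {"count": "4"}}
print(apply_overrides(defaults, ["tasks.count=8"]))
```

The original dictionary is left untouched, so the file stays the reproducible record while the command line carries the one-off change.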

vsoch commented 1 year ago

I've never written a job spec - can we walk through the user process in needing one? E.g, is it something like:

  1. I want to control how my job runs on the resources
  2. I have prior knowledge that a "jobspec" is what I want - I google for it
  3. I copy-paste an example and try to edit it
  4. I save it alongside my workflow for future me and others
  5. I can "one off" any parameter in it to make an on-the-fly change.

I think the case I'm concerned about is:

  1. I want to control how my job runs on the resources
  2. I have no idea about a jobspec
  3. I stop there

And maybe the solution for that would be to have a UI that can help someone walk through creating a basic one? I've shown this before, but it's fairly simple to do with Vue.js: https://researchapps.github.io/job-maker/ so it would be like:

  1. I want to control how my job runs on the resources
  2. I have no idea about a jobspec, but there is a big button all over the docs that I'm drawn to check out that says "Create a Flux Workflow file" (I don't know what a jobspec is, but "Workflow file" makes sense to me). I also see this file (and link) mentioned in just about every tutorial so I know it's important.
  3. I click through the UI, when it's complete I can click "Download" and commands show me how to use it (and apply a one off fix).
  4. I go do that, I'm a happy user :)

We could call the default name jobspec.yaml if that seems like the right name - I like workflow files that use consistent names OR extensions because if you see it, you know you have a workflow of X type (e.g., Snakemake, Dockerfile, Singularity, main.nf (Nextflow), etc.). But TLDR - right now (at least by my perception) these jobspec files are not popular or known - most people directly flux submit / run something and only expert users know about this file. I think that needs to change so it's on the level of understanding for the average user, and easy to make / tweak.

trws commented 1 year ago

At the moment, I think it's essentially an internal detail unless you're using the APIs, so yes we would need to do something about teaching how to use it. Honestly that's why I originally focused on providing a command-line interface that would map more 1-1 onto it rather than going directly to having users provide a full file. We have documentation of the format, we have examples, but haven't written up user guides or anything because they haven't really been user-facing up to this point.

vsoch commented 1 year ago

I think what I'm saying is that (I think) we definitely want something that is user facing. A common way to specify your needs for a job in a file. And it should be encouraged.

trws commented 1 year ago

Yes, I agree with that, it's a big part of why we designed jobspec the way we did. That said we need to do the legwork, and have the discussions with everyone, to work out how to make a "simplified" version that users would actually be willing to write, and implement a translator or preprocessor that would make that work. In fact I implemented a prototype one in ~2019 but, to my shame, never finished the job to actually make it into the flux run/submit commands.

At the same time, it's a source of no small frustration that most users are using flux with slurmish arguments right now. That introduces ambiguities in what they're asking for, and has a tendency to leave users upset when those ambiguities aren't filled in the way they guessed. That's why having a way to specify less ambiguous resource requests on the CLI is a priority to me as well.

The last part of this is that we have a lot of users who won't touch a yaml file for this with a 10' pole. They've written csh since before either of us were born, and if they can't specify their job in comments in a script then we will hear about it (in fact we did, in #4942). This is part of why RFC 36 comments exist, and they provide at least one way to specify your workflow in a (script) file already; they just don't support specifying them with resource-spec or job-spec right now, and adding the CLI arguments would make them support that without other extensions.

All of that said, yes we should support user-supplied jobspec, yes we should encourage reasoning about resources that way and providing us that whenever possible. It will give us much better, more precise information, and prevent many of the mismatches between the way our model works and the way a user reasons about their commands. It is also a big part of what I meant by the specification part of the proposal abstract.

trws commented 1 year ago

Sorry for the book, this is one of my top things on my list of things I haven't managed to see finished that I wish I had. That and seeing users run into issues with whether -N 1, or -c 1, means a whole node or not drives me up the wall because we spent so much time and effort working out what I think really is a better way, but it's still hidden behind the old one. 😞

vsoch commented 1 year ago

Come back down from the wall! It's definitely not a trivial thing to design the spec, but maybe we should decide what is step 1 and then take that step? Even if it's very hard, if we can agree on the ultimate outcome we want, that will advise that step. Maybe I can help?

The last part of this is that we have a lot of users who won't touch a yaml file for this with a 10' pole.

Maybe... an 11 foot pole? Just kidding :) I (unfortunately) don't have these users in mind - I'm thinking of Flux as an open source project, e.g., when my Kubecon talk is seen by thousands of people there are probably going to be some folks that go check it out, and the more it becomes known as an open source project we will have more folks like that. We don't want to lose them. It might seem backwards, but if we can grow a large external / open source community, that is going to potentially give back to the project in a much larger way than any small developer team could. All of that trickles back down to users at the lab (with their 11 foot poles).

They've written csh since before either of us were born, and if they can't specify their job in comments

idk I'm a dinosaur I am pretty damn old. :laughing:

trws commented 1 year ago

LoL, I'm all good with that, and I would love to have your help. It's something we should definitely finish out, maybe add a jobspec file reading option first, then build in a simplifying translator, then actually build out the slot/shape-centric command setup.

vsoch commented 1 year ago

@trws can you show me a few more examples of how (more complex shapes) expand into resource specs? E.g., akin to https://github.com/flux-framework/flux-core/issues/2213#issuecomment-1444351037 with maybe another layer / more attributes? I'll try an easy one to see if I get it:

#  Node[4]:Slot:Core[2]
# same as?
#  Node[4]:Slot[1]:Core[2]

resources:
  - type: node
    count: 4
    with:
      - type: slot
        count: 1
        label: default
        with:
          - type: core
            count: 2

What about the label attribute? (or other attributes for a section - not supported?)

vsoch commented 1 year ago

For keeping track of things - here is step 1 to write a simple spec: https://github.com/flux-framework/rfc/pull/371

vsoch commented 9 months ago

Ping folks on this issue - can we pick up the discussion about shape I linked above? I think this is going to be useful for moving forward.