flux-framework / flux-sched

Fluxion Graph-based Scheduler
GNU Lesser General Public License v3.0
86 stars 40 forks source link

Need a runtime method to change scheduling parameters #337

Closed dongahn closed 6 years ago

dongahn commented 6 years ago

I believe we discussed this at a meeting.

tpatki commented 6 years ago

I might be interested in adding the support for this.

dongahn commented 6 years ago

@tpatki: This may be more important than the BGL issues. This is something that can be enormously useful for the upcoming MLSI Sierra push. If you are willing to work on this, I can tag this as a next release item. Let me know.

tpatki commented 6 years ago

@dongahn: Yes, sure. Was working on the other smaller issues first to see if we can resolve warnings etc, but can switch to this as per your recommendation.

tpatki commented 6 years ago

@dongahn: I'm just looking at t1005 queue-depth and delay-sched parameters as a starting point now. If I get this correctly, we want to have those be accessible dynamically. Will adding these as a part of scheduler eventing in sched.c in the flux_msg_handler and registering an event be the right direction?

dongahn commented 6 years ago

Great star! I would use an actual request/response instead of eventing though. Take a look at the exclude/include handler within sched as an example.

tpatki commented 6 years ago

@dongahn: How should I expose these two parameters for testing? I can see that sched-include/exclude/cancel as exposed through flux wreck subcommands, which makes sense. Should this be supported on the lines of: "flux wreck queue-depth 4", or do you have other ideas there?

dongahn commented 6 years ago

Good question. Since we don't yet have a sched specific front end command, I have to think piggybacking with flux-wreck would be reasonable.

Maybe flux wreck sched-attr key1=val2,key1=val2

@grondo, @SteVwonder?

SteVwonder commented 6 years ago

That is a great question. We should probably open a separate ticket for how we want to handle this long term (wreck is going away with the exec redesign, right?), but I agree that adding it to wreck is good for now.

I also like the suggestion of the sched-attr because then we can potentially leverage the existing sched_params_args to do the parsing.

What about sched plugin parameters? I could see changing the reservation depth at runtime being a useful thing. Maybe for that flux wreck sched-plugin-attr key1=val2,key1=val2?

dongahn commented 6 years ago

That is a great question. We should probably open a separate ticket for how we want to handle this long term (wreck is going away with the exec redesign, right?), but I agree that adding it to wreck is good for now.

Yeah we need a ticket for the long term solution.

What about sched plugin parameters? I could see changing the reservation depth at runtime being a useful thing. Maybe for that flux wreck sched-plugin-attr key1=val2,key1=val2?

I have a slight reservation to add different subcommands for sched param and sched plugin param. If implementable, an alternative would be use hierarchical keys while using the same subcommand (sched-attr)?

flux wreck sched-attr queue-depth=1 fcfs.its_attr1=100

BTW, while we are at it, we might also need query. Maybe we can also just use one subcommand:

flux wreck sched-attr key1 key2

I am sure such subcommand structures have precedence in flux-core. If there are good examples, it would be good to follow the convention...

SteVwonder commented 6 years ago

I am sure such subcommand structures have precedence in flux-core. If there are good examples, it would be good to follow the convention...

Yeah, totally agree on that. I was trying to think of core-side modules that allow changing parameters at runtime that aren't flux-broker attrs, but I couldn't think of any. @garlick? @grondo?

I have a slight reservation to add different subcommands for sched param and sched plugin param. If implementable, an alternative would be use hierarchical keys while using the same subcommand (sched-attr)?

That's fine with me. I think this ultimately harkens back to our discussion on how to handle config/attributes consistently throughout Flux (i.e., https://github.com/flux-framework/flux-core/issues/1295)

grondo commented 6 years ago

I was trying to think of core-side modules that allow changing parameters at runtime that aren't flux-broker attrs

If the parameters aren't set by broker attributes, then I would think each module would have a utility command that would modify those parameters in a way that makes sense.

One example that comes to mind is the cron sync command, which modifies the sync handler in the cron module at runtime. The default sync and sync_epsilon can also be set as a module parameter.

However, it might be nice to have helper functions for modules to allow tunables to be exported from modules voa a table, with an implied module.attributes service which could list and modify the attributes, with a callback in the module to handle updates, so that simple module-specific attributes don't have to come with a burden of writing a lot of extra code. (Maybe this is handled by broker attributes? I didn't look back at flux-framework/flux-core#1295 yet)

You can tack the sched-attr subcommand onto flux-wreck for now if you want, but like @SteVwonder said, that command is going away in the near future. At some point the flux-sched project will want to install its own utility -- perhaps a flux sched <command>? (I think flux-core exports liboptparse so that development of the subcommand should be trivial)

dongahn commented 6 years ago

You can tack the sched-attr subcommand onto flux-wreck for now if you want, but like @SteVwonder said, that command is going away in the near future. At some point the flux-sched project will want to install its own utility -- perhaps a flux sched ? (I think flux-core exports liboptparse so that development of the subcommand should be trivial)

Yes this will be the future and we need a ticket for this one as discussed.

This is a one-off task @tpatki is doing to help the current Sierra users in the next few weeks.

dongahn commented 6 years ago

@grondo: BTW, I don't see the release target to tag for flux-sched that can go with flux-core 0.10.0. How do you create one for sched?

grondo commented 6 years ago

You can create a new milestone here: https://github.com/flux-framework/flux-sched/milestones (I know, sometimes it is hard to find. A Milestones button appears on the top-level issues page for all GitHub projects)

tpatki commented 6 years ago

I'll add this as a wreck subcommand for now. I think we should be able to reuse most of the sched-side code (except the front-end utility/command) in the future. Thanks everyone for the pointers!

tpatki commented 6 years ago

One more question: do we want to preserve the static options (e.g. flux module load sched sched-params=queue-depth=4), or should everything be tuned through a flux wreck sched_attr?

Also, if we want to preserve it, we should probably use the same nomenclature for ease of access, as in, flux wreck sched-params instead of sched-attr.

SteVwonder commented 6 years ago

My vote is to keep the capability of passing arguments to sched when it is loaded. Seems like a usability win to support it in both places. It would also help with parameters that only make sense at load time.

I also agree on making the naming consistent.

tpatki commented 6 years ago

Right, agreed. That's my vote too but wanted to get an opinion on it. I'll make it consistent and preserve the current static option.

dongahn commented 6 years ago

Yes. It would be wise to keep them.

tpatki commented 6 years ago

I have a simple first cut working with a single parameter, and currently it doesn't use sched_parse_params (working on changing that now).

For the query option that Dong suggested, I am planning on creating a different request altogether within sched. This means I'll have to parse the options while in the flux-wreck command, but this seems to be a good approach as we can reuse the sched_parse_params for the first request (set-parameters), and we can write a different function for returning the values when its a query.

With a single request, we'll have to modify sched_parse_params to check for both sets of arguments (eg queue_depth and queue_depth= ), and returning multiple values may be difficult. Also, this would mean that we would have to add in more return-value checking when the sched parameters are set when the module is loaded (i.e, static setting).

Just checking to see if the former is the right way to go. Thoughts, @dongahn, @SteVwonder?

SteVwonder commented 6 years ago

This means I'll have to parse the options while in the flux-wreck command, but this seems to be a good approach as we can reuse the sched_parse_params for the first request (set-parameters), and we can write a different function for returning the values when its a query.

:+1: Sounds like a good plan. The duplicated parsing code seemed odd to me at first, but I think it is unavoidable considering the set of options we will want to support at runtime and at load time might diverge.

tpatki commented 6 years ago

Right, it seems unavoidable if we want to support a query.

tpatki commented 6 years ago

@grondo, @dongahn, @SteVwonder:

I was wondering if I should design runtime updation/query with a one subcommand or two.

Specifically, should I go with one subcommand such as flux wreck sched-params --queue-depth 8 or flux wreck sched-params --queue-depth, where the former will see an argument and invoke a handler that sets the queue-depth, and the latter will see no arguments and can be treated as a query that returns the queue depth.

I can of course make this much simpler by using two subcommands, such as flux wreck set-sched-params --queue-depth 8 and flux wreck get-sched-params --queue-depth.

Is there a preference (going with the first one for now).

grondo commented 6 years ago

@tpatki: Since we are using flux wreck command to set these parameters to allow rapid development, you should probably take the approach that is simplest for you.

If I had to choose, I would suggest a set and get type interface like

flux wreck sched-param get [ITEM] flux wreck sched-param set ITEM=value

get (or maybe list) can get the current value, with no args lists all items and their values.

set would work as you expect.

(set with ITEM=VALUE allows easy extension to future module parameters)

However, I leave it up to you since the flux wreck command will be short-lived anyway.

dongahn commented 6 years ago

I like @grondo's suggestion using a single sub subcommand (sched-param).

tpatki commented 6 years ago

@grondo: Agreed, this is what I'm going with. With flux wreck sched-param get [ITEM] I'll need to define another set of subcommands (get and set within the sched-param), so I'll keep it simpler than that. flux wreck sched-param ITEM will return the current value, and flux wreck sched-param ITEM=VAL will update the current value (no sets/gets).

Also, even though flux wreck is going away, I think it is easy to move these subcommands to a new flux-sched utility. I will look into adding that utility for #361 and then adding similar subcommands once I'm done with this current issue.

grondo commented 6 years ago

@tpatki, sounds good! It might be nice for users if flux wreck sched-param dumps all current values to stdout (if that is possible with the current implementation)

Let me know if you need any help with lua string matching functions, but there is an example of parsing key=value pairs in the setenv subcommand.

tpatki commented 6 years ago

Thanks @grondo, I looked at the key=value example, thanks for that example.

I was just using simple options like the rest of the subcommands for my first cut (eg --queue-depth 8 --delay-sched 0).

Is there an easy way or example to enable something like the sched-params get/set that you suggested earlier? I wasn't sure if I would need a nested subcommand to address the get/set and parse options further down, especially forflux wreck sched-params set ITEM1=VAL1, ITEM2=VAL2?

If there's an easy way to do that, let me know.

grondo commented 6 years ago

In your SucCommand handler for sched-params, the returned arg parameter is a table/array of the rest of the arguments from the command line. You can check to see if arg[1] is set or get manually there, and then take appropriate action.

Then, for the set case, like setenv you can iterate the rest of arg, split each element into var, val with local var,val = expr:match("(.*)=(.*)") instead of requiring the user to use --key=value. This is also much more easily extended to new parameters (code doesn't have to change)

tpatki commented 6 years ago

Makes sense, let me try that.

Another question. In the callback corresponding to the sched-params get in sched.c, how should I pack and return/respond with multiple values so they can be unpacked in the subcommand for printing? I saw the simple example of using resp.value in the subcommand, but I'm not sure how to handle multiple values (eg queue-depth, delay-sched).

grondo commented 6 years ago

The json response will be unpacked as a table to Lua so you can iterate over it with ipairs (or you should be able to). I have to head out for a bit, but when I get back I will try to find an example.

tpatki commented 6 years ago

Sounds good! Thanks for all the help and patience with my questions :)

grondo commented 6 years ago

No problem! It occurred to me that a good example for you (on the sched.c request handler(s) side) would be the getattr,setattr, and lsattr handlers in src/broker/attr.c.

tpatki commented 6 years ago

Sounds good, thanks!

tpatki commented 6 years ago

@dongahn, @SteVwonder:

I wanted to clarify the logic for handling updation of delay-sched at runtime. By default, delay-sched is set to true, which means that the three watchers (before (prepare), after (check) and idle) are created/started by default if no parameters are specified. The ctx->after (check watcher) may invoke schedule_jobs.

We now have three possible transitions at runtime for delay-sched. Note that runtime updates can happen multiple times (that is, something like true-->false-->true-->true is possible). I'm listing what needs to happen in each scenario, let me know if I missed anything.

1) Case 1: true --> false

2) Case 2: false --> true

3) Case 3: false --> false or true --> true

Let me know if I missed something, and also suggestions on how to test this workflow correctly.

SteVwonder commented 6 years ago

That all sounds reasonable to me. As for how to test it, that is a great question. I don't think we test delay-sched works "performance-wise" right now (specifically that it queues jobs and processes them in batches when the scheduler is idle rather than one at a time). @dongahn would know for sure though, so I'll let him chime in.

tpatki commented 6 years ago

We don't test delay-sched or queue-depth for correctness at the moment, and that's where my question was coming from. I am testing with log messages/step through, but I'm not sure how to develop a test case for it.

dongahn commented 6 years ago

@tpatki: this sounds reasonable as well.

I don't think we need a performance test for delay-sched. I do think we do have some correctness coverage in the send that we look at whether the cores on the nodes are all scheduled with or without this option.

It seems the new test coverage should be whether the prepare/check/idle watchers are called when delay-sched=true. Maybe the set request handler can check whether the watcher states are appropriate before returning the RPC and the test can be written based on the returned code from the handler...

tpatki commented 6 years ago

Thanks, @dongahn & @SteVwonder! @dongahn: Yes, adding a test to check the states of the three watchers and setting a return code before the RPC makes sense. I was printing out info there, I'll change it and add an appropriate test case. Thanks for the prompt, late night replies to my questions -- really appreciate the help.

dongahn commented 6 years ago

362 resolved the issue.