utexas-bwi / rocon_scheduler_requests

Interfaces for managing rocon scheduler requests
http://wiki.ros.org/rocon_scheduler_requests

Request priority configuration #4

Closed: stonier closed this issue 10 years ago

stonier commented 10 years ago

Currently the requester sets priority flags in its constructor. We need some finer control over priorities:

1) At the request level - i.e. at new_request.

and/or

2) At the resource level - embedding a priority flag in Resource.msg.

Probably both are useful. I can already think of ways to use 2), but it's not something that every scheduler would need to use. 1) could be used to write a simple scheduler, or to provide a way of bumping priorities for a whole group request regardless of the individual resource priorities.
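Roughly, in data-structure terms, the two options would look something like this (just an illustrative Python sketch; the field names are made up and are not the actual message definitions):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Resource:
    """Loosely mirrors the idea of Resource.msg (field names illustrative)."""
    rapp: str
    uri: str
    priority: Optional[int] = None   # option 2): per-resource priority flag

@dataclass
class Request:
    """A single scheduler request."""
    resources: List[Resource] = field(default_factory=list)
    priority: int = 0                # option 1): per-request priority, e.g. passed at new_request
```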

Thoughts?

jack-oquin commented 10 years ago

I am not yet persuaded 2) is a good idea. Is it for handling "ranged requests"?

We discussed using priorities for that, but I believe there are better and simpler solutions.

stonier commented 10 years ago

Suppose we don't have ranged requests. Do we need individual priority flags? It sounds like something I'd want to do, but I can't think of a concrete use case where I'd really need them right now.

If we do have ranged requests, then I would like to be able to give the necessary resources and the optional resources different priorities, e.g. for a 3-5 robot request.

Case 1) I might set the three robots with the necessary flag to high priority, and the two with the optional flag to a really low priority, so that the optional part doesn't steal robots from services with higher priorities.

At the same time, it would still be good to support Case 2): setting high priorities for all five robots when I want to make sure my service gets even the optional robots before other services get their requests fulfilled.

piyushk commented 10 years ago

+1

jack-oquin commented 10 years ago

I understand the semantics of per-request priorities, but can't explain what per-resource priorities would mean.

Suppose the three high-priority robots in our example are available, but one of the other two is missing or allocated to some other task. At that point the request should be granted with the four available robots; the missing one is dropped from the request. Priorities do not enter into it.

Should the scheduler instead continue to hold up the entire request because the optional missing robot had a high priority? I don't think so.
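In code terms, the granting rule I have in mind for that example is roughly this (a toy sketch in Python, nothing to do with the actual implementation):

```python
def try_grant(min_needed, max_wanted, available):
    """Grant only when at least min_needed robots are free; take up to
    max_wanted of them. Missing 'optional' robots are simply dropped;
    priorities never enter into the decision."""
    if len(available) < min_needed:
        return None                     # keep waiting
    return available[:max_wanted]

# three necessary robots plus two optional ones requested, four robots free:
print(try_grant(3, 5, ['foo1', 'foo2', 'foo3', 'foo4']))
# -> ['foo1', 'foo2', 'foo3', 'foo4']: granted with four robots, the fifth dropped
```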

For a resource to have its own priority, we need to be able to process it as a separate request which can be granted at a different time. Our current protocol does not allow for that.

I can imagine requesters specifying very elaborate constraints on possible resource matches, like:

((two foo or one bar) and 3 baz) 

But I do not believe adding a per-resource priority would help much with that.

stonier commented 10 years ago

Ah, that is an interesting and elaborate resource matching constraint.

With our example I wasn't thinking of holding up the request... rather, managing it continuously. E.g. a common startup scenario for us is:

Request A : 3xfoo@(15,necessary) + 2xfoo@(10,optional)
Request B : 1xfoo@(14,necessary)

With a flow like:

I feel it's very important to be managing the requests continually - robots will be invited and uninvited at irregular intervals and we will also consider pre-emption at some point.

Note in the above, if we distinguish necessary vs optional, then perhaps necessary requirements derive some sense of priority - they should get filled before optional requirements.
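In other words, something like sorting pending allocation blocks so that necessary ones always come ahead of optional ones, with the numeric priority only breaking ties (a toy illustration, not real code):

```python
# (request, flag, priority, count) tuples for the example above
pending = [
    ('A', 'necessary', 15, 3),   # 3xfoo@(15,necessary)
    ('A', 'optional',  10, 2),   # 2xfoo@(10,optional)
    ('B', 'necessary', 14, 1),   # 1xfoo@(14,necessary)
]
# necessary blocks first, then by descending priority
pending.sort(key=lambda r: (r[1] != 'necessary', -r[2]))
print(pending)
# fill order: A's necessary block (15), B's (14), then A's optional block (10)
```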

jack-oquin commented 10 years ago

Doing something like that requires a different data structure and more states. Our existing states are discrete: a request can be WAITING or GRANTED, but not both.

I don't understand the parenthesized numbers in your example, or why request A cannot be divided into two or three separate requests with different priorities.

If subordinate parts of a request really need to be handled separately, we should reconsider my earlier idea of defining sets of requests. I prefer our current design, because it is much simpler. But, it really does not accommodate the scenario above.

piyushk commented 10 years ago

Given Jack's last comment, I now understand the idea behind request-based priorities instead of resource-based ones. If I understand your point correctly, Jack: if a higher-priority request comes in, it gets bumped ahead in the queue?

Also, what this means in Daniel's example is that once robot foo3 connects, A is granted and B comes up at the head of the queue. Once foo5 connects, nothing happens because both requests have been satisfied? I would be fine with this solution.

Daniel, how useful would these continuous requests be to you? Are they already part of a use case, or do you foresee needing them sometime soon?

jack-oquin commented 10 years ago

I am making some assumptions about scheduler implementation based on examples I've known.

It is common for schedulers to maintain some sort of "ready queue" containing requests that are waiting for a resource. That queue is often sorted by priority. The usual practice is to assign the first queued request when a resource becomes available. Whether the priorities are fixed and strictly followed (like SCHED_FIFO in Posix) or dynamically adjusted as desired by the scheduler (like SCHED_OTHER) is a design choice.

I believe something similar to Daniel's example could be accomplished within our existing framework:

Requester A :  
  request A1: 3xfoo@(pri=15)
  request A2: 1xfoo@(pri=10)
  request A3: 1xfoo@(pri=10)
Requester B : 
  request B1: 1xfoo@(pri=14)

With a flow like:

Note that doing it that way is itself definitely a design choice. Other approaches would prefer to grant [foo1] to B1, so it can run immediately and perhaps finish its task.

There are no absolutely right or wrong answers to these questions. It is mostly a matter of system design trade-offs. That is one reason we want to allow for alternative scheduler implementations.
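For what it's worth, here is a toy version of such a strictly priority-driven ready queue, using the request names above (illustrative only; it bears no relation to the actual scheduler code):

```python
import heapq
import itertools

class ReadyQueue:
    """Toy priority-sorted ready queue: highest priority first,
    FIFO among equal priorities, never skipping the head request."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, priority, request_id, robots_wanted):
        # negate priority because heapq is a min-heap
        heapq.heappush(self._heap,
                       (-priority, next(self._counter), request_id, robots_wanted))

    def dispatch(self, free_robots):
        """Grant queued requests strictly in priority order while enough
        robots remain; stop as soon as the head request cannot be filled."""
        granted = []
        while self._heap and self._heap[0][3] <= free_robots:
            _, _, request_id, wanted = heapq.heappop(self._heap)
            free_robots -= wanted
            granted.append((request_id, wanted))
        return granted, free_robots

q = ReadyQueue()
q.push(15, 'A1', 3)   # 3xfoo @ pri 15
q.push(10, 'A2', 1)   # 1xfoo @ pri 10
q.push(10, 'A3', 1)   # 1xfoo @ pri 10
q.push(14, 'B1', 1)   # 1xfoo @ pri 14
print(q.dispatch(free_robots=1))   # ([], 1): foo1 is held back for A1, B1 keeps waiting
print(q.dispatch(free_robots=4))   # ([('A1', 3), ('B1', 1)], 0): A2/A3 still queued
```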

piyushk commented 10 years ago

I understand, and your solution makes sense to me.

stonier commented 10 years ago

Ok, I'm going to take a step back. I think I was in a hurry to solve things; however, this particular example highlights your point that there is no right answer.

Just to focus: the problem we have, in plain English (and probably a very common situation), is:

My service needs a mobile manipulator and at least 3,
maximum 5 turtlebots to function. It shouldn't grab any
of these robots until the minimum requirements are
met (so that robots aren't sitting around idle locked
into a non-functioning group task - very important).
After allocation, if the mobile manipulator or the 
minimum of 3 should fail to be met, the service should
safely terminate and then deallocate the remaining
resources.

I was immediately trying to group all member resources of that request because they are correlated with each other and with the request: 1) they should not get allocated until the group requirements are met, and 2) the allocated request should not raise any signals until the minimum requirements fail to be met.

I was also assuming that the request could remain valid while turtlebots dynamically get allocated to replace turtlebots that have left, without the service having to manage this.
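To pin the constraint down a bit (purely an illustrative sketch; none of these names exist anywhere in the messages or the code):

```python
from dataclasses import dataclass

@dataclass
class PooledRequirement:
    """Hypothetical grouped requirement: don't allocate anything until the
    minimum is met, tear down once it can no longer be met."""
    manipulators_needed: int = 1
    turtlebots_min: int = 3
    turtlebots_max: int = 5

    def can_start(self, manipulators_free, turtlebots_free):
        return (manipulators_free >= self.manipulators_needed
                and turtlebots_free >= self.turtlebots_min)

    def must_terminate(self, manipulators_held, turtlebots_held):
        return (manipulators_held < self.manipulators_needed
                or turtlebots_held < self.turtlebots_min)

req = PooledRequirement()
print(req.can_start(manipulators_free=1, turtlebots_free=2))       # False: don't grab anything yet
print(req.can_start(manipulators_free=1, turtlebots_free=4))       # True: allocate 1 + up to 5
print(req.must_terminate(manipulators_held=1, turtlebots_held=2))  # True: below minimum, wind down
```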

Jack, I believe the solution you have above is simpler, but it will delegate management of this turtlebot 'pool' to the service, which will have to constantly monitor its own requests. I imagine it will have to dynamically cancel them when a request's resources all disappear, and create new requests on the fly to ensure that its minimum requirements are always met.

Q) Is doing this 'pooling' management of resources prior to the scheduler the right idea? Implementation-wise, I imagine it would mean sub-classing the Requester class.

Q) Jack, what happens from the scheduler's point of view if a resource for a request disappears?

stonier commented 10 years ago

Forgot to add thoughts about Jack's queue-based requests.

jack-oquin commented 10 years ago

Q) Is doing this 'pooling' management of resources prior to the scheduler the right idea? Implementation-wise, I imagine it would mean sub-classing the Requester class.

That makes sense. I have not thought enough about that side of the pond.

Q) Jack, what happens from the scheduler's point of view if a resource for a request disappears?

I'd been assuming that the requester would discover that on its own, through interaction with the device. If it does not notice, then maybe it does not matter. But, perhaps these services are higher-level than I had realized and don't really interact with individual devices.

From the scheduler's perspective, it would remain allocated and might reconnect. If the requester discovers the device is not responding, it can release that resource and request a new one. Maybe that is not what we want.

Q) Is it relying on the individual foo requests being lower priority than the group request?

The example discussed above relies on priorities and on the scheduling policy being strictly priority-driven, i.e. never satisfying a lower-priority request ahead of a higher-priority one. That is not usually done, but could be. At least with operating systems the normal heuristic is that it's better to act now rather than later, if possible.

Q) What happens if one of the group of three resources leaves?

As above: nothing unless the requester updates that request. That looks awkward with our current multiple resources per request approach. We may want to figure out a smoother answer. On the other hand, if those three robots were really the minimal useful allocation, then releasing them and re-requesting them may be the most logical approach.

jack-oquin commented 10 years ago

I thought about Daniel's idea of providing higher-level wrappers for the Requester class. There are some advantages to that approach:

jack-oquin commented 10 years ago

I had not considered the case of services that do not need to actually maintain contact with the resources they allocate. A concrete example or two would probably clarify those requirements for me.

If that is a common situation, we may need some additional request states or other mechanism for notifying the requester that previously allocated resources are no longer available. It looks similar to the PREEMPTING -> PREEMPTED state transitions, and maybe they would be adequate. But, some indication of the cause might help.

jack-oquin commented 10 years ago

See also the comment about removing the priority field from scheduler_msgs/SchedulerRequests in robotics-in-concert/rocon_msgs#46. It's no longer needed.

stonier commented 10 years ago

I've been getting warm and cosy about the higher-level wrappers for requesters notion as well. It feels like the right thing to do, for all the reasons you listed. Given that, and the callback implemented on the other (scheduler) side to handle varying scheduling implementations, it starts to look a little bit like posix:

kernel implementations ... posix ... higher-level wrappers (qt/boost)

Posix works great because it provides the atomic building blocks of the unixiverse (even if it does suck to program with directly).

stonier commented 10 years ago

I'm going to go ahead now with the link-graph-style services we use for the chatter and turtlesim tutorial services, and see if I can implement something along these lines (a higher-level requester and a handler for scheduling on the other side).

Shall we wrap up this issue?

jack-oquin commented 10 years ago

I am happy with the basic protocol. It will probably evolve based on experience, but it's already enough to do plenty of interesting work.

But, I think we still need to resolve the notification issue: how should services not in communication with their robots be informed of outages?

The scheduler will presumably be monitoring entry and exit from the concert, so it makes sense to use that mechanism for notification, when necessary.

My current suggestion is: rather than add new status transitions, add a uint16 reason field to the Request message. When the status changes, this field can be updated to provide additional information. Reasons might include things like: preempted, lost contact, went away, etc.
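Hypothetically, the reason values could be something as simple as this (sketching the idea in Python rather than committing to actual .msg changes; the names and numbers are made up):

```python
from enum import IntEnum

class Reason(IntEnum):
    """Hypothetical values for a uint16 'reason' field that accompanies a
    status change; names and numbers here are illustrative only."""
    NONE = 0          # status changed for the usual protocol reasons
    PREEMPTED = 1     # displaced by a higher-priority request
    LOST_CONTACT = 2  # robot stopped responding
    WENT_AWAY = 3     # robot left the concert

# e.g. alongside a preemption-style status transition the scheduler could
# also report why; the value itself is just a plain small integer:
print(int(Reason.LOST_CONTACT))   # 2
```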

jack-oquin commented 10 years ago

With issue #10 now tracking the remaining unresolved question for this issue, I suspect #4 can be closed.