utexas-bwi / rocon_scheduler_requests

Interfaces for managing rocon scheduler requests
http://wiki.ros.org/rocon_scheduler_requests

How to cancel resource requests #14

Closed · stonier closed this issue 10 years ago

stonier commented 10 years ago

Bit unsure how this is done. Actually, totally unsure. Got a cookie for me Jack?

jack-oquin commented 10 years ago

Call rq.release() on it.

stonier commented 10 years ago

Is this an operation that should only be done from within the safety of the requester feedback callback?

That is, is there a better way to expedite the sending of this command?

jack-oquin commented 10 years ago

It can be done from anywhere.

Since the callback runs in a normal rospy callback context, it should be safe with respect to other normal activities of the node, but I am sure there are potential serialization issues if your node does its own threading.

The Requester.rset data member gives access to all currently active requests. If you know the UUID of the request, your node can do this:

    # rqr is the Requester instance; TransitionError comes from this
    # package's transitions module.
    try:
        rqr.rset[request_id].release()
    except KeyError:
        print('request no longer exists')
    except TransitionError:
        print('request not in a releasable state')
    rqr.send_requests()

Always use the Requester.send_requests() method after updating one or more requests. That doc should say it is useful after any request update, not just calls to new_request().

jack-oquin commented 10 years ago

We could add a Requester.release(id) method. It would really only replace the rqr.rset[id].release() statement, but would be easier for users to discover.

I am more inclined to improve the documentation with better usage examples, like the one above.

What do you think?
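
For concreteness, the whole method being proposed would probably amount to no more than this (hypothetical sketch of a method on Requester, not part of the current API):

    # Hypothetical convenience method on Requester:
    def release(self, request_id):
        """ Release the request with the given UUID.

        Raises KeyError if the request no longer exists, and
        TransitionError if it is not in a releasable state.
        """
        self.rset[request_id].release()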

stonier commented 10 years ago

Yeah, I was actually trying something similar:

    for request in self._request_set.values():
        if request.msg.status == scheduler_msgs.Request.GRANTED:
            request.release()
    self._requester.rset.merge(self._request_set)
    self._requester.send_requests()

I think this is fine for now - if we see the same patterns emerging in almost every requester implementation, we can fold them into the Requester itself later.

I'm worried about whether the rset is going to get mangled if it is hit from both program threads and ROS callback threads. I don't think I can guarantee that from outside the class. For instance, the feedback code in the requester:

    new_rset = RequestSet(msg.requests, self.requester_id)
    prev_rset = copy.deepcopy(self.rset)
    self.rset.merge(new_rset)

    # invoke user-defined callback function
    self.feedback(self.rset)

    if self.rset != prev_rset:      # msg or callback changed something?
        self.send_requests()        # send new request immediately

There is a merge there on the requester's rset that I can't protect from conflicting with usage outside the class in a potentially different thread.

jack-oquin commented 10 years ago

Many applications will do all their work from various rospy timer or topic callbacks. I believe none of them run concurrently; they share the rospy.spin() thread.

For more-complex threaded rospy applications, I feel sure there are serious serialization issues to consider.
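
One way to sidestep those issues is to keep all request manipulation inside the feedback callback and only pass flags in from other threads. A rough sketch; everything except Requester.rset, release(), and the automatic send_requests() behaviour is hypothetical, and the TransitionError import path is an assumption:

    import threading

    # Assumed import path for TransitionError.
    from rocon_scheduler_requests.transitions import TransitionError


    class MyRequesterNode(object):
        """ Hypothetical node that only touches the rset inside the feedback callback. """

        def __init__(self, requester):
            self._requester = requester              # rocon Requester instance
            self._cancel_wanted = threading.Event()  # thread-safe flag

        def request_cancel(self):
            """ Safe to call from any thread: just records the intent. """
            self._cancel_wanted.set()

        def feedback(self, rset):
            """ Requester feedback callback: the only place requests are modified. """
            if self._cancel_wanted.is_set():
                for request in rset.values():
                    try:
                        request.release()
                    except TransitionError:
                        pass        # not in a releasable state, leave it alone
                self._cancel_wanted.clear()
            # The Requester calls send_requests() automatically after this
            # callback whenever the rset has changed, so no explicit send
            # is needed here.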

A new Requester.release_all() method would be useful for most applications.

stonier commented 10 years ago

Yes, I like the idea of a Requester.release_all().

Or maybe Requester.cancel_all(), which would release granted requests and cancel the others?

jack-oquin commented 10 years ago

Interface issues like these become much clearer with real-world examples.

Currently, "release" applies to any request, whether WAITING, RESERVED, GRANTED or PREEMPTING.

We should pick a consistent verb to use. "Cancel" might be a better choice, since "release" only seems appropriate for GRANTED requests. I have no objection to making that change. We could leave the other method around as a deprecated synonym, if that helps the transition.
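
Keeping a deprecated synonym could be as simple as the usual pattern (illustration only, not the actual implementation):

    import warnings


    class ResourceRequest(object):
        """ Illustrative excerpt only, not the real class. """

        def cancel(self):
            """ Cancel this request, whatever its current status. """
            pass                    # the real status transition would go here

        def release(self):
            """ Deprecated synonym kept around during the transition. """
            warnings.warn('release() is deprecated, use cancel() instead',
                          DeprecationWarning, stacklevel=2)
            return self.cancel()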

Maybe "close" would be a good verb. The requester may be finished or may be interrupting some incomplete operation.

If we do change the verb, we should probably update the Request.status constants, too. That will be somewhat more disruptive.

stonier commented 10 years ago

I can't think of any case where I'd want to release and not just more generally cancel.

Note - I tried releasing a non-granted request (I can't recall whether it was actually NEW or WAITING), and the requester responded with an invalid transition error.

jack-oquin commented 10 years ago

WAITING -> RELEASING is currently allowed (see transitions.TRANS_TABLE), but NEW -> RELEASING is not. It probably could be, but the table assumes specific transitions happen on one side or the other of the request protocol. When the scheduler sees a NEW request it will reply with some other state: ABORTED, GRANTED, PREEMPTING, REJECTED, or WAITING.

I had not previously considered the case of a requester cancelling a NEW request before receiving updated feedback from the scheduler. That suggests adding additional state transitions, which seems OK, but I need to think about the implications some more.
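
For reference, the idea behind the table is just a whitelist of (old, new) status pairs. This is only an illustration; the real transitions.TRANS_TABLE may be structured differently:

    # Illustration only, using the Request status constants from scheduler_msgs.
    from scheduler_msgs.msg import Request

    ALLOWED = {
        (Request.WAITING, Request.RELEASING),   # currently permitted
        (Request.GRANTED, Request.RELEASING),   # currently permitted
        (Request.NEW, Request.RELEASING),       # the addition discussed here
    }


    def transition_allowed(old_status, new_status):
        """ True if the requester may move a request from old_status to new_status. """
        return (old_status, new_status) in ALLOWED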

stonier commented 10 years ago

Just for your information - I am looking at what happens when the requester's owner gets an instruction to shut down. At that point it needs to clean up everything, i.e. release and cancel all requests. That signal can come at any time, even 1 ns after a new request has been made. I am reluctant either to wait for the feedback callback (too long in a shutdown process) or to speed up the frequency just for this.
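
Something like the following is what I have in mind for that cleanup, as a sketch against the current release()/send_requests() API (the TransitionError import path is an assumption):

    # Assumed import path for TransitionError.
    from rocon_scheduler_requests.transitions import TransitionError


    def shutdown_requests(requester):
        """ Release everything we can and push the update out immediately. """
        for request in requester.rset.values():
            try:
                request.release()
            except TransitionError:
                pass                    # already finished or not releasable
        requester.send_requests()       # don't wait for the next feedback cycle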

jack-oquin commented 10 years ago

Good point.

My initial reaction is that transitions within requester-generated states (or within scheduler-generated states) are almost always valid and should be added to the transition table. But I do need to think about that some more. Maybe there should be separate tables for the two sides of the interface.

A stickier issue is whether a node can continue sending and receiving messages at various points during the shutdown. If the service has received a stop message, rospy is presumably still active, i.e. rospy.shutdown() has not been called yet. But, eventually, all that infrastructure is going to disappear.

At some point we will need to rely on missing heartbeat messages to clean things up. That will take a long time, perhaps sixteen seconds (four missed messages at 1/4 Hz); I have not implemented that yet, see #7.

So, a requester shutting down intentionally should definitely release its resources cleanly, if it can.

jack-oquin commented 10 years ago

The commit above provides much additional documentation along the lines discussed here.

Formatted output is here: http://farnsworth.csres.utexas.edu/docs/rocon_scheduler_requests/html/requester.html

jack-oquin commented 10 years ago

The commit above defines the ResourceRequest.cancel() method, and deprecates release() but still accepts it.

It also defines many more valid status transitions, including NEW -> RELEASING.

No scheduler_msgs/Request changes are assumed. At some point, those labels should be organized better, with some being renamed. The old labels can be kept around for a while during that transition.

jack-oquin commented 10 years ago

@stonier: is this issue resolved?

stonier commented 10 years ago

:+1:

jack-oquin commented 10 years ago

We now have Requester.cancel_all(), which recovers resources when a requester signs off.

But, there is no explicit mechanism for a Requester to tell the scheduler it is gone. Eventually, the scheduler will figure it out after enough heartbeat messages have been missed. The Requester.unregister() method merely stops the heartbeat messages and is only useful for testing the scheduler. I'm going to rename it _unregister() to avoid API confusion.

I can't think of any reason why the scheduler needs to know sooner. The overhead for an inactive requester with no resources assigned is small. I just wanted to make a note of it, because it is relevant to this discussion.