ros / roslisp_common

Clone of the old roslisp_common SVN repo from code.ros.org.
8 stars 14 forks source link

race condition in actionlib:action-client #8

Open airballking opened 10 years ago

airballking commented 10 years ago

I discovered a race condition in the action-client implementation.

What's happening:

I created a minimalistic client provoking this behaviour here: https://github.com/airballking/actionlib_client_tests

Additionally, I added some printouts to some action-client-internal functions to see what's going on. Here's my printout:

\ START PRINTOUT **

* FIRST CALL WITH NEWLY-STARTED SERVER CLIENT GOES FINE * ACTIONLIB-CLIENT-TESTS> (provoke-race-condition)

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289470430000000

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289480790000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000, server-state=PREEMPTED, client-state:DONE maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289470430000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000, server-state=PREEMPTED, client-state:DONE maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289480790000000 NIL :ABORTED

* SECOND CALL LOSES GOAL-HANDLE * ACTIONLIB-CLIENT-TESTS> (provoke-race-condition)

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289611860000000

send-goal: goald-id=/fibonacci-lisp-client_GOAL_13862289621880000000

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000, server-state=PREEMPTED, client-state:DONE

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000, server-state=PREEMPTED, client-state:DONE

client-result-callback: goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000, server-state=PREEMPTED, client-state:DONE maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289621880000000 [(ACTION-CLIENT STATE) WARN] 1386228968.586: Goal `/fibonacci-lisp-client_GOAL_13862289611860000000' is lost. Terminating goal.

action-client-transition: goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000, server-state=LOST, client-state:DONE maybe-finish-goal: removing-goal goal-id=/fibonacci-lisp-client_GOAL_13862289611860000000 NIL :ABORTED ACTIONLIB-CLIENT-TESTS>

\ END PRINTOUT **

It seems the result for the first goal-handle (in the second run) is never received. As a result, the goal-handle is never removed and hence eventually lost. Echoing the result-topic, I verified that the server did send result messages out.

Anybody has an idea why the result subscriber of action-client never processes the second result message?

Additionally, when aborting the action-client seems to return to the caller before it received status-updates or result. Is this desirable?

airballking commented 10 years ago

After further digging: The result-callback of the action-client does receive the result-message. However, the call to (has-goal ... ) inside the callback fails. Seems, the goal has not been or is no longer registered when the result-message comes in. I'll keep looking.

moesenle commented 10 years ago

In case of cancellation, does your server send a result message? If yes, is that correct behavior in terms of the action server protocol?

On Thu, Dec 5, 2013 at 10:05 AM, Georg Bartels notifications@github.comwrote:

After further digging: The result-callback of the action-client does receive the result-message. However, the call to (has-goal ... ) inside the callback fails. Seems, the goal has not been or is no longer registered when the result-message comes in. I'll keep looking.

— Reply to this email directly or view it on GitHubhttps://github.com/ros/roslisp_common/issues/8#issuecomment-29881880 .

airballking commented 10 years ago

Yes, the server sends a result message after it got into state PREEMPTED. PREEMPTED is a terminal state of the protocol. The protocol requires the server to send a result message once it reached a terminal state. So, all seems fine from the server.

moesenle commented 10 years ago

And is the client removing the goal from the set of tracked goals on cancellation? If yes, remove that. Is the cancelled goal still on the status topic? If no it's a server issue.

On Thu, Dec 5, 2013 at 10:47 AM, Georg Bartels notifications@github.comwrote:

Yes, the server sends a result message after it got into state PREEMPTED. PREEMPTED is a terminal state of the protocol. The protocol requires the server to send a result message once it reached a terminal state. So, all seems fine from the server.

— Reply to this email directly or view it on GitHubhttps://github.com/ros/roslisp_common/issues/8#issuecomment-29884344 .

airballking commented 10 years ago

I get the feeling that my problem are multiple instances of action-clients on the lisp side. If you look into my example, I create a new action-client every time I call (provoke-race-condition). I played around in the REPL and lost goal-handles appeared once I created the second/third action-client. With only one action-client, I can send and cancel as many goals as I want.

airballking commented 10 years ago

I have used my applications without problem (and quick send/cancel of goals) for a week now. All fine. So, I guess it's paramount to create just one instance of action-client per action.

Can you think of an easy way of enforcing this? Topics keep a register of subscriptions. We could do the same for action-clients and return an error every time a user wants to re-create a specific action- client. It's limiting the behaviour of the lib but clarifying to the user what the intended usage is.

moesenle commented 10 years ago

I don't like singletons and I multiple action clients for the same action should definitely possible. Just to be sure: was anything bad happening ore were there just warnings? Could that even be a bug in roslisp?

On Fri, Dec 13, 2013 at 10:09 AM, Georg Bartels notifications@github.comwrote:

I have used my applications without problem (and quick send/cancel of goals) for a week now. All fine. So, I guess it's urgent to create just one instance of action-client per action.

Can you think of an easy way of enforcing this? Topics keep a register of subscriptions. We could do the same for action-clients and return an error every time a user wants to re-create a specific action- client. It's limiting the behaviour of the lib but clarifying to the user what the intended usage is.

— Reply to this email directly or view it on GitHubhttps://github.com/ros/roslisp_common/issues/8#issuecomment-30495459 .

airballking commented 10 years ago

Nothing bad happened. I just got warnings about dropped goal-handles but the goal was sent out, the server performed correctly, and cancel also correctly stopped the server.

I'm not sure where this bug is coming from, to be honest.