[Closed] corot closed this 3 years ago
I like the idea, but the mutex also protected us from concurrent access to the members of the iterator.
From a quick look, we might for example run into problems when calling start and cancel "concurrently" (assuming we call start first):
// inside AbstractActionBase::start
// if slot_it is newly created, the execution is nullptr
slot_it->second.execution = execution_ptr;
// inside AbstractActionBase::cancel
concurrency_slots_[slot].execution->cancel();
The latter would be a nullptr dereference.
If it's not a nullptr, we still might call cancel on an execution that hasn't started yet, which would then miss the cancel.
Check added. We can take care of the possible corner cases, but getting frozen, as happens now, is not an option! :skull:
@corot By "freeze", do you mean a deadlock (the node just sits there and never recovers)? If so, can you explain how this happens?
Yep, the node freezes and doesn't even react to the rosnode kill signal. To be honest, I cannot explain how this happens, because even when we return from the cancel, we don't recover (I would expect the new goal to start then, as the lock is released... but somehow this doesn't happen; we never release the lock).
Reproducing is easy, though. Implement the cancel method of your controller, e.g.
bool cancel()
{
  for (int i = 0; i < 50; ++i)
  {
    ROS_INFO("cancel %d...", i + 1);  // note: the original i+1/5 parses as i + (1/5) == i
    ros::spinOnce();
    ros::Duration(0.1).sleep();
  }
  ROS_INFO("cancel DONE!");
  return true;
}
The spinOnce is important, because otherwise MBF won't call start and cancel simultaneously. Removing it sounds like it solves the problem, but then your controller won't get ANY info, e.g. velocity feedback.
Then you send any path to exe_path, cancel it, and send it again within 5 seconds.
After reconsidering this PR, I have another proposal for handling smooth cancel without the need to block the cancel callback: #274. Early feedback more than welcome!! :pray:
Closed in favor of #274
We are locking access to the concurrency_slots_ map on every operation, such as start and cancel. That's not needed: iterators into the map are not invalidated by adding new entries, and we never remove concurrency slots other than on shutdown.
With the current master, if we receive a new goal while performing a lengthy cancel (e.g. because we want to handle a smooth stop, as requested in this PR), MBF will freeze, as we lock slot_map_mtx_ twice, here and here.
With this PR, we accept the new goal, wait for the previous cancel to complete, and then execute the new goal.