magazino / move_base_flex

Move Base Flex: a backwards-compatible replacement for move_base
https://uos.github.io/mbf_docs/
BSD 3-Clause "New" or "Revised" License
434 stars 154 forks source link

Fix number of retries check #262

Closed Timple closed 3 years ago

Timple commented 3 years ago

We had an issue where we wanted to retry the controller zero times. But it gave a MAX_RETRIES error. Which is not correct.

This also solves the TODO in one of the tests.

dorezyuk commented 3 years ago

I'm not sure if that's correct. Consider the case where you allow 1 retry. To me this means that you can try twice.

the code

if(max_retries >= 0 && ++retries > max_retires) { /* max_retries */ }

should be correct and will call our function twice.

The suggested fix

if(max_retries >= 0 && retries++ > max_retires) { /* max_retries */ }

Will call the function three times - or in general MAX_RETIES+2 times. IMO this is very misleading. My initial TODO was there because I confused one retry with "one attempt".

To fix your issue I would suggest to follow the way how the planner solves this by writing

if(max_retries > 0 && ++retries > max_retires) { /* max_retries */ }

Which will only return MAX_RETRIES if they aren't set to zero.

Timple commented 3 years ago

allow 1 retry. To me this means that you can try twice.

Agreed.

Which will only return MAX_RETRIES if they aren't set to zero.

I think 0 retries is a valid entry and should be supported. So, in that case, if a retry is attempted, it should return MAX_RETRIES since none are allowed.

But by the time you reach this part of the code, one attempt has already been made and failed.

So your example of max_retries = 1 let's evaluate the fix:

// run-and-fail
if(max_retries >= 0 && retries++ > max_retries) -> if(1 >= 0 && 0 > 1) -> if(False) -> no error thrown
// run-and-fail-again
if(max_retries >= 0 && retries++ > max_retries) -> if(1 >= 0 && 1 > 1) -> if(False) -> no error thrown
// run-and-fail-again
if(max_retries >= 0 && retries++ > max_retries) -> if(1 >= 0 && 2 > 1) -> if(True) -> error

So you're right. This does run once too often. I'll fix.

Timple commented 3 years ago

So the original implementation seemed correct. That was a thinking error from our side. However the planner seems to have different logic from the controller. So that seems off.

Our problem initially was that our local controller was asked to preempt. And once done preempting it returned CANCELLED. Which is >10 and no retries where allowed. Thus mbf aborted on MAX_RETRIES while it should have returned preempted.

Might be related to a threading? Since the first statement of the while is to check for a cancel_. But after that check and before the max_retries check the controller was asked to cancel it seems...

(Would you like me to open a new issue for this or can we keep discussing here?)

Timple commented 3 years ago

Which will only return MAX_RETRIES if they aren't set to zero.

This should actually fix our issue. So this PR should cover it.