magazino / move_base_flex

Move Base Flex: a backwards-compatible replacement for move_base
https://uos.github.io/mbf_docs/
BSD 3-Clause "New" or "Revised" License
434 stars 154 forks source link

multiple planner retries despite setting "planner_max_retries" to 0 #287

Closed ChristofDubs closed 2 years ago

ChristofDubs commented 2 years ago

It seems that this pull request has introduced a bug:

When setting planner_max_retries to 0 and planner_patience to >0, in case makePlan() fails without exceeding the planner_patience, the last else case is reached:

https://github.com/magazino/move_base_flex/blob/da5d1e45acce55b1d92c672beec057dc44eda229/mbf_abstract_nav/src/abstract_planner_execution.cpp#L352-L355

so makePlan() will keep getting called. @dorezyuk actually pointed this out himself in one of his comments.


In terms of fixing this: It seems to me that for the check here

https://github.com/magazino/move_base_flex/blob/da5d1e45acce55b1d92c672beec057dc44eda229/mbf_abstract_nav/src/abstract_planner_execution.cpp#L332-L336

the initial version was correct, where it was max_retries_ >= 0 instead?

ChristofDubs commented 2 years ago

Alternative fix:

There is a test that checks that for planner_max_retries = 0 and planner_patience = 0 case, the return code is NO_PLAN_FOUND if makePlan() returns an error.

If that is the desired behavior, maybe we could instead consider removing the && patience_.isZero() part here?

https://github.com/magazino/move_base_flex/blob/da5d1e45acce55b1d92c672beec057dc44eda229/mbf_abstract_nav/src/abstract_planner_execution.cpp#L347-L351

Doing so still makes the current tests pass.

dorezyuk commented 2 years ago

Thx for reporting the issue.

If I set the patience to e.x. 1 sec (and max_retries to zero) the mbf tries to generate a path for 1 sec and fails after the patience elapses. To which value did you set the patience to get the infinite loop?

Maybe there is also a misunderstanding what should happen if max_retries is set to zero. My idea was to allow the user either to specify a number of retries - or if the number of retries is not interesting to allow him to disable the retries counting and just focus on the time spent in planning (or to specify both, max_retries and patience).

ChristofDubs commented 2 years ago

Thx for reporting the issue.

If I set the patience to e.x. 1 sec (and max_retries to zero) the mbf tries to generate a path for 1 sec and fails after the patience elapses. To which value did you set the patience to get the infinite loop?

Thank you for the swift reply. Yeah so by "infinite retries", I meant it will keep retrying and only stop once the patience runs out. Sorry for using a bit inaccurate terminology; I fixed the issue title to "multiple retries".

Maybe there is also a misunderstanding what should happen if max_retries is set to zero. My idea was to allow the user either to specify a number of retries - or if the number of retries is not interesting to allow him to disable the retries counting and just focus on the time spent in planning (or to specify both, max_retries and patience).

This behavior you are describing already existed before your PR by setting max_retries to a negative number. And intuitively, setting max_retries to zero should mean it tries to make a plan once, and if it fails, no (=0) retries?

Basically, this test should pass even with patience set to 1:

https://github.com/magazino/move_base_flex/blob/da5d1e45acce55b1d92c672beec057dc44eda229/mbf_abstract_nav/test/abstract_planner_execution.cpp#L114-L135

which it would with the alternative fix I proposed above.

ChristofDubs commented 2 years ago

Based on what's already there, the desired behavior in case the makePlan() keeps failing seems to be:

  1. max_retries < 0, patience > 0: keep retrying until the patience runs out, return PAT_EXCEEDED
  2. max_retries < 0, patience = 0: keep retrying forever
  3. max_retries = 0, patience = 0 or large: don' retry, and return NO_PLAN_FOUND
  4. max_retries > 0, patience = 0 or large: keep retrying until max_retries is reached, return MAX_RETRIES

other cases:

  1. patience smaller than it takes to reach max_retries > 0: interrupt and return PAT_EXCEEDED
  2. patience smaller than it takes to reach max_retries = 0: interrupt and return PAT_EXCEEDED

There are already tests for 3. with zero patience (test name: no_plan_found), 4. with large patience (test name: max_retries) and 6. (test name; patience_exceeded).

If these are the desired behaviors, I can the make the "Alternative fix" from above and write additional tests. There should for sure be a test for case 3. with large patience (i.e. what this whole Issue# 287 is all about), and there should also be one for 1. (the disable retry counting behavior). 5. is covered sufficiently by 6., and for 4., the patience value doesn't really matter. I guess I could also add a test for 2., even though that's a bit a strange configuration.

Thoughts?

corot commented 2 years ago

Based on what's already there, the desired behavior in case the makePlan() keeps failing seems to be:

Yes, I think that's the intended behavior; is a bit convoluted, but afair we where merely replicating old, good move_base behavior. @ChristofDubs explanation makes it more clear, so let's be sure that we incorporate to the documentation.

So reverting this change should be enough. The alternative fix should also work, but to me is a bit more obscure.

corot commented 2 years ago

Thx for reporting the issue.

If I set the patience to e.x. 1 sec (and max_retries to zero) the mbf tries to generate a path for 1 sec and fails after the patience elapses. To which value did you set the patience to get the infinite loop?

Maybe there is also a misunderstanding what should happen if max_retries is set to zero. My idea was to allow the user either to specify a number of retries - or if the number of retries is not interesting to allow him to disable the retries counting and just focus on the time spent in planning (or to specify both, max_retries and patience).

You can achieve your idea by simply setting max_retries to -1, which makes the code effectively ignore the retry concept and fallback to patience. Apart from that, that breaks old move_base behavior, afair

ChristofDubs commented 2 years ago

@corot as per our discussion, reverting the >= check would mean that the return code would change to MAX_RETRIES instead of NO_PLAN_FOUND, hence in the PR for the fix, I went with the alternate fix proposed.

ChristofDubs commented 2 years ago

The corresponding fixing pull request was merged. Hence, closing this issue.

Thank you for your cooperation.