magazino / move_base_flex

Move Base Flex: a backwards-compatible replacement for move_base
https://uos.github.io/mbf_docs/
BSD 3-Clause "New" or "Revised" License
424 stars 154 forks source link

Fix multiple retries despite planner_max_retries=0 (#287) #288

Closed ChristofDubs closed 2 years ago

ChristofDubs commented 2 years ago

fix for #287

As described in the issue, if planner_max_retries is set to 0 and makePlan() fails before the patience is exceeded, the state should be set to NO_PLAN_FOUND immediately, and no retries should occur (just like in the case where planner_patience is set to 0).

I also added a test for this fix (and they fail without the fix), and took the opportunity to add a few other cases that didn't have tests yet.

dorezyuk commented 2 years ago

Thanks for the tests and the pr.

I think the base logic we want to implement here is

Patience Retries Effect On Failure
Enabled Enabled Both PAT_EXCEEDED/MAX_RETRIES
Disabled Disabled No retries NO_PLAN_FOUND
Enabled Disabled Patience PAT_EXCEEDED
Disabled Enabled Retries MAX_RETRIES

The only question is if we say the "retry logic" is disabled at max_retries < 0 or max_retries == 0. Both is fine for me, but I thought it might be confusing if we disable the "patience logic" at zero and the "retry logic" at max_retries < 0. On one hand the max_retries < 0 would be more consistent with move_base. On the other hand - move_base does not support the disabling of the "patience logic" - so they don't have this inconsistency... But if you guys feel its more intuitive to disable the "retry logic" at max_retries < 0, I'm fine with that.

I don't entirely understand why we should return NO_PLAN_FOUND if we reach out max-retries (if its set to zero). For me this seems rather counter-intuitive.

The biggest problem with this pr is that it creates a dead lock for the second row in the table above. If we say "retry logic" is disabled at max_retries < 0 and if the also disable the patience (patience.isZero == True), the code will never exit the loop

ChristofDubs commented 2 years ago

The only question is if we say the "retry logic" is disabled at max_retries < 0 or max_retries == 0. Both is fine for me, but I thought it might be confusing if we disable the "patience logic" at zero and the "retry logic" at max_retries < 0. On one hand the max_retries < 0 would be more consistent with move_base. On the other hand - move_base does not support the disabling of the "patience logic" - so they don't have this inconsistency... But if you guys feel its more intuitive to disable the "retry logic" at max_retries < 0, I'm fine with that.

I'd be in favor of infinite attempts for the max_retries < 0 case:

It seems that the parameter name max_retries is what is causing the inconsistency to begin with: If it was instead called max_attempts, then max_attempts = 1 would mean no retries (max_retries == 0) and max_attempts = 0 would be infinite retries (max_retries = -1); this would then be consistent with patience disabling.

I don't entirely understand why we should return NO_PLAN_FOUND if we reach out max-retries (if its set to zero). For me this seems rather counter-intuitive.

It just returns NO_PLAN_FOUND if it failed and it is not allowed to do any retries (which it already did before this PR); I was assuming that MAX_RETRIES should not be returned since it didn't do any retries:

Patience Retries Effect On Failure
Enabled > 0 Retries + Patience PAT_EXCEEDED / MAX_RETRIES
Disabled > 0 Retries MAX_RETRIES
Enabled 0 1 attempt + Patience PAT_EXCEEDED / NO_PLAN_FOUND
Disabled 0 1 attempt NO_PLAN_FOUND
Enabled < 0 Patience PAT_EXCEEDED
Disabled < 0 infinite retries -

My personal opinion is that the two NO_PLAN_FOUND and MAX_RETRIES return codes are a redundant to begin with; I just based my changes on what was already there to not break any existing code.

Assuming we used NO_PLAN_FOUND regardless of how many retries were configured, then above table would simplify to:

Patience Retries Effect On Failure
Enabled >= 0 finite attempts + Patience PAT_EXCEEDED / NO_PLAN_FOUND
Disabled >= 0 finite attempts NO_PLAN_FOUND
Enabled < 0 Patience PAT_EXCEEDED
Disabled < 0 infinite attempts -

The biggest problem with this pr is that it creates a dead lock for the second row in the table above. If we say "retry logic" is disabled at max_retries < 0 and if the also disable the patience (patience.isZero == True), the code will never exit the loop

Setting max_retries < 0 and patience == 0 would already have this effect without this PR? So it's not a new problem that was introduced.

I also think this is a valid configuration, and exactly the behavior one would expect by giving infinite patience and infinite retires: It means the planner should never give up. I wouldn't recommend such a configuration, but it might be what the user wants if there are no alternative goals to target and no recovery behaviors. Or, the user has his own logic of when to cancel and re-send the action.

dorezyuk commented 2 years ago

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

corot commented 2 years ago

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

Inf retries is a valid option; :+1: for merging this as it is (and do the same for exe_path, see my comment above)

spuetz commented 2 years ago

Ah ok, then the misunderstanding I had was that I considered the "infinite retries" as a bug.. So the question we should answer is if we want to allow something like a infinite retry for the user... @corot @spuetz What do you think?

Yes, inf retries as an option makes sense... as the action and by that the execution can be canceled.

corot commented 2 years ago

Yes, inf retries as an option makes sense... as the action and by that the execution can be canceled.

So we all agree with this solution? if so,,, @spuetz, @dorezyuk, please approve