strands-project / strands_movebase

A repository for all the STRANDS-augmented movebase, including 3D obstacle avoidance, etc.
MIT License
10 stars 20 forks source link

Making movebase work #62

Open marc-hanheide opened 8 years ago

marc-hanheide commented 8 years ago

team members @cburbridge @bfalacerda @cdondrup @Pandoro @Jailander @denisehe @TobKoer @PDuckworth

Identified issues

  1. global planner is updating too slowly (tried 10Hz)
  2. the local planner's parameterisation is off and uses many "guessed" values
  3. we need a test definition (to have some "genetic" optimisation for the parameters)
  4. observed problems
    1. turning towards obstacles
    2. stopping too late (maybe acceleration are just wrong)
    3. zig-zag paths
    4. making physical contact
    5. the robot always tries to go to the exact waypoint (should already be relaxed with the newer version)
    6. local cost map probably too small
    7. global cost map inflation could be wrong: We don't want the robot to be always in the middle of a corridor, so we don't want any gradual inflation.
    8. if the robot cannot reach a waypoint (closed door), it stays in front of the door blocking it (might be solved better on topological / execution level)
    9. robot is quite "pushy" navigating through crowds (this should already be working with human-aware nav)

      Test cases

      • general setup: L-shape, with 3 topo map with 3 waypoints (one intermediate)
      • running via topo nav
      • collect rosbags
      • the obstacles are always "around the corner", 1m away from intersection

        static tests

  5. place robot close to obstacle (10cm away, 45, 90, 135), goals right behind, and at 45 degree angle
  6. traverse narrow corridor (1m, corridor in global map)
  7. traverse corridor with row of chairs (2m wide, leaving a 1m gap) a. all chairs on one side b. chairs on both sides
  8. robot trapped
  9. (optional: traverse a door)
  10. blocking the exact intermediate waypoint (putting a chair there)
  11. blocking the target waypoint (putting a wheelchair there), stopping at appropriate distance
  12. chairs blocking the whole route
  13. human standing next to wall in 2m corridor, with crutches sticking out, creating a gap of about 1m

    dynamic tests

  14. emergency stop: "jump" in front of robot (50 cm distance)
  15. pass-crossing: (person walks / wheelchair drives) in front to the robot coming from the side: robot should stop
  16. pass-by: (person walks / wheelchair drives) in front to the robot coming from the side: robot should stop
  17. pass-crossing and pass-by with a group of people (group blocks the way for the robot), with a 10 second break of the group standing in front of robot

    Thinks to do

    • [x] collect all nav params used across sites and comment your observation / characteristic (please comment on this issue)
    • [ ] look at cost maps in the mon nav from deployment data and identify the situations where it failed
    • [ ] implement above test cases and iterate parameters on all sites

To prevent regressions:

marc-hanheide commented 8 years ago

OK, so this is the brief summary of what we planned to do. No it's time to get going on this. Can ask representatives of the sites (maybe @bfalacerda @Pandoro @Jailander @PDuckworth and @nilsbore) to check the parameters they are using, post them (i.e. link them) from here and write one paragraph summarising your observations.

Then, we should create the setup for the test cases, e.g. have a shared topological map which we only slightly adapt to our own environments, but have the names consistent. I suggest, UOL (@jailander, myself, @cdondrup) create a first prototype of the test setup.

Then we should have a hangout to discuss the next steps. Here's a doodle I ask you all to fill to arrange this.

Many thanks

marc-hanheide commented 8 years ago

forgot @hawesie... sorry

bfalacerda commented 8 years ago

I forgot we can't check what params we were using on Bob during the deployment because his hard drive died. I'm pretty sure we were using the current strands_movebase ones though.

However, @kunzel did some changes to the dwa params to make it follow the global path more closely. It was only added to scitos_2d_nav, because it makes navigation in simulation work much better. The differences are

cost =
  path_distance_bias * (distance to path from the endpoint of the trajectory in meters)
  + goal_distance_bias * (distance to local goal from the endpoint of the trajectory in meters)
  + occdist_scale * (maximum obstacle cost along the trajectory in obstacle cost (0-254))

Maybe @cdondrup has some insight on these params, since he has been looking at dwa?

cdondrup commented 8 years ago

Actually setting path_distance_bias to 1 and goal_distance_bias to 0.5 should have had the opposite effect and allow the robot to diverge from the global path a bit more. However, if the DWA only uses the params described above (I remember there was goal align and path align somewhere as well but that might have been in a different implementations of it) then this shouldn't have changed much. The ratio is key here so with 32 and 24 it should have been the same as 1.0 and 0.75... so by changing it to 1.0 and 0.5 we only just gave less weight to the goal_distance_bias.

The occdist_scale is the one responsible for obstacle avoidance and the default here should be 0.01 which means it only takes lethal obstacles into account and doesn't have effect on the other two params.

bfalacerda commented 8 years ago

yes, from what i understand, the short sim time is what makes a big difference here: since both those values only look at the endpoint of the simulated trajectories, if we increase their size we increase the number of trajectories that don't go along the global plan, but have an endpoint close to it

cdondrup commented 8 years ago

Isn't that what you want for obstacle avoidance?

bfalacerda commented 8 years ago

depends. if we increase the rate of the global planner it might be better to keep the sim time smaller? with this version we had the robot rotating a lot more in place, at least in simulation.

Pandoro commented 8 years ago

We are indeed using some custom parameters. We had our old hiwi run around and optimize the parameters in such a way that the behaviour would be 'optimal' and we have the following in our "site parameter"-file:

local_costmap:
  width: 4
  height: 4
  resolution: 0.02

  # Improves navigation in tight corridors tremendously.
  inflation_layer:
    inflation_radius: 0.5

DWAPlannerROS:
  max_vel_y: 0.0
  min_vel_y: 0.0

I can't really say what the idea was behind these changes, just that it felt a bit more stable after he played around with the parameters. In general navigation seems okayish, apart from of course the driving towards obstacles and the slightly awkward behaviour in our kitchen that I described during the GA. According to our launch file we are supposedly running the human aware navigation, but he doesn't stare at people so maybe something went going wrong there. I'll check more accurately on our next run with Karl.

marc-hanheide commented 8 years ago

Could you please submit a link to the "whole" set of files, as this already assumes there would be a standard you deviate from ;-)

Pandoro commented 8 years ago

You can check out the strands karl repo. This should have everything we use, including the above file. We took most of the stuff from what we had before in the marathon system and updated it with everything new and relevant from the review system.

marc-hanheide commented 8 years ago

@Pandoro @Jailander @cdondrup Can I ask you to fill the doodle poll at https://doodle.com/poll/56f7c5ydtqny8p78 as well, please?

marc-hanheide commented 8 years ago

For the record, I link the current versions of params here:

BHAM

RWTH

UOL

missing

KTH

marc-hanheide commented 8 years ago

@bfalacerda do you remember why we move to our own NavFn? https://github.com/strands-project/strands_movebase/commit/561836295f0418981240e0b1eb5db40f90d7c0ba

bfalacerda commented 8 years ago

yes, so we could reconfigure the default_tolerance param a runtime. This makes the robot not go to the exact pose of the intermediary waypoints.

marc-hanheide commented 8 years ago

right, I remember...

nilsbore commented 8 years ago

A git diff to the branch we are using reveals no changes to the parameters.

Jailander commented 8 years ago

@marc-hanheide in Linda we use the standard parameters namely the ones at https://github.com/strands-project/strands_movebase/tree/indigo-devel/strands_movebase/strands_movebase_params

marc-hanheide commented 8 years ago

simulation env in https://github.com/strands-project/strands_morse/pull/138

nilsbore commented 8 years ago

http://answers.ros.org/question/201535/move_basedwa-parameters-for-a-large-robot-with-rotation-center-in-the-front/

Then I thought well, I am going to read the source code, which helped a lot. and thanks for good coding + comments. In dwa_local_planner page, the forward_point_distance is defined as an additional scoring point in front of the robot. I frankly didn't understand what it meant. However, in the source code, I found out that this value is being used as a cost value for robot alignment to the path. By setting it to zero, the trajectory will not be penalized if the robot is not aligned on the path. This is a very important factor for differential drive robots whose centre of rotation is not the in the middle of the robot. Apparently, a lot of valid trajectories were being dropped because of this. As soon as I set this to zero, the robot was able to perform much better. However, improvements could be done so I continued.

This makes me want to investigate how the robot behaves if we set forward_point_distance: 0.0. The point afterwards about visualizing the evaluated trajectories also sounds like a good point.

marc-hanheide commented 8 years ago

very useful @nilsbore !

Quick summary of today's hangout:

  1. @Jailander to define the first test case as part of strands_navigation unit test as a blueprint
  2. @bfalacerda to talk to @cburbridge to fork https://github.com/ros-planning/navigation into STRANDS including Chris' visualisation of policies (and release it from there)
  3. @cdondrup to work on the testing framework, including recording of trajectories. It has been discussed that we split the test into a test preparation service which will differ in simulation and real, and an actual unit test that call the preparation service (e.g. to "beam" the robot to its starting position), and then run the actual test.
  4. Everybody to replicate first test case "live" once working
  5. Everybody to add additional unit tests for the other cases identified above (incrementally, always in line with live robot testing)
  6. @marc-hanheide to facilitate simulation test on Jenkins: https://github.com/strands-project/strands_ci/issues/15
  7. @creuther to check again for relation between force and DWA parameters
cburbridge commented 8 years ago

I have forked navigation into strands-project and added my debug out changes to the "strands-testing" branch. These changes are not permanent as they make the code less efficient, so this branch should only be used for debugging etc. Therefore I don't think a "release" would be a good idea.

To use it, download and build the repo in you workspace then restart navigation, and run the (ugly) python script dwa_planner/scripts/log.py. This will open a window with six images, each showing the score for the critics used in the code. The axis for each graph are: y=angular velocity, x=linear velocity, centre is zero and the values are scaled by 10. so (x=10,y=30) represents (lv=0, wv=0), (x=15,y=20) represents (lv=0.5, wv=-0.1). The graphs are updated in real time as the robot drives, but some timesteps will be missed due to the speed of the redraw. If this is a problem then you can log the topic debug topic and replay it slower...

marc-hanheide commented 8 years ago

@cburbridge agreed, not to be released. I just thought we'll eventually fix code in navigation and then we'd have our own. But for now, let's leave it as a debug tool.

marc-hanheide commented 8 years ago

People will (probably) like to hear that Jenkins can now run simulation-enabled test (more precisely, it can run test in a virtual OpenGL/X environment, hence allowing us to write unit test that use the simulator and/or other GUIs. https://github.com/strands-project/strands_navigation/pull/270 is a first example (though a stupid one as it only starts the simulator but doesn't do anything with it, but it's a test). One outcome of any tests we now run on jenkins (devel or PR) is now a highly compressed (1 fps) video (see https://lcas.lincoln.ac.uk/jenkins/view/Pull%20Requests/job/pr-indigo-strands_navigation/98/artifact/Xvnc.mp4 for the corresponding example) showing the virtual desktop (jump to minute 6 in said video to see the simulator appearing). This video should help to identify causes of failure in unit tests. As this is facilitated by a virtualised compute server GPU, you cannot expect the performance of a real GPU here, but it's enough to run morse in wireframe (and maybe also small/simple environments).

Having written all this, I should probably write the same to the mailing list...

Anyway, so the infrastructure is there to actually write more elaborate tests... now somebody needs to do it ;-)

creuther commented 8 years ago

I talked to my people and this is the (probably not as satisfying as hoped for) gist of it:

1. The SCITOSDriver specifies a whole lot of maximum/minimum velocities and maximum positive/negative accelerations. These values are passed down to the MCU (Main Control Unit) and will be used as soft limits in the motor controller. Most of the time these constraints are met, but apparently it cannot be guaranteed (only in extreme situations).

This means that the "drive parameters" of all STRANDS robots are currently specified here. More specifically, these parameters govern the way soft limits:

        <MaxForwardVelocity>1.0</MaxForwardVelocity>
        <MaxBackwardVelocity>-0.4</MaxBackwardVelocity>
        <MaxRotVelocity>120</MaxRotVelocity>
        <TransEpsilonVelocity>0.02</TransEpsilonVelocity>
        <RotEpsilonVelocity>1</RotEpsilonVelocity>
        <MaxTransAcceleration>0.8</MaxTransAcceleration>
        <MaxTransDeceleration>0.9</MaxTransDeceleration>
        <MaxRotAcceleration>180</MaxRotAcceleration>
        <MaxRotDeceleration>260</MaxRotDeceleration>
        <MaxEmergencyDeceleration>1.6</MaxEmergencyDeceleration>

Rotations given in degrees / {second,second^2} and translations given in meters / {second, second^2}.

The changes @bfalacerda made with respect to the DWA parameters should reflect those parameters. The parameters can also be tweaked to achieve softer acceleration, more abrupt braking, etc.

2. (paraphrasing Stefan, as I don't know a lot about electronics): The force basically imposes a threshold for the motor current, which is encoded as a 8 bit PWM (hence the values 0-255 cover the complete analogue spectrum). From a motor controlling perspective it would be ideal to always drive at full force, but due to wheel slippage and things like relying on the robot to get stopped by the charging station's threshold, we compromise and use a value of typically around 80-120. Therefore lowering the force too much will increase the stopping distance. Also, with a force of around 100 the acceleration/deceleration limits set (see above) aren't necessarily reached, so increasing the force will enable the robot to get closer to the set limits if they are "quite high".

There is no real formula or table I can give you regarding the force and the accelerations. But setting lower values for accelerations should mitigate the influence of changing the force on the acceleration/deceleration behaviour.

marc-hanheide commented 8 years ago

given the values are defined in the Mira config file would it make sense to expose them as rosparams and have DWA configured from those params? obviously we would expose them as rad/s(^2) where needed?

creuther commented 8 years ago

Certainly possible, they can be queried in MIRA and therefore exposed as rosparams. Though note that they are not dynamically reconfigurable, they are written to the MCU at startup and then can't be changed during runtime. But still, it would probably beat having to maintain two sets of files (robot config file and dwa params file) and all the potential trouble that comes with that. I can get together with @cburbridge to help expose the properties.

cburbridge commented 8 years ago

I have done some test in the scitos_mira code, and I think there is a ~400ms delay between commanding a velocity and the odometry reporting that the robot is moving at that speed. This is just the delay on the MIRA side, it does not include any delay between the DWA selecting a velocity and the MIRA code sending that velocity to the firmware. @creuther would you expect this kind of timing?

nilsbore commented 8 years ago

I made some improvements to how the clouds are processed by movebase, which should gains us some more CPU on the main computer: https://github.com/strands-project/strands_movebase/pull/64 .

marc-hanheide commented 8 years ago

In response to https://github.com/strands-project/strands_movebase/issues/62#issuecomment-152651469: @creuther can you comment? A delay of almost half a second can indeed explain a lot of stuttering I guess. Is there a way to reduce this or is there a way to make DWA aware of such a delay (asking very silly questions)...

creuther commented 8 years ago

This seems very peculiar to me. I wanted to do a test on Betty but no one's here yet so I'm hesitant to stop the running ROS stuff.

From our side, we expect some delay in setting and passing the command to the motors ( as always, giving an exact number is not possible, unfortunately... :( ) but certainly not 400ms. I'll have a look and report back as soon as someone turns up at the office ;)

hawesie commented 8 years ago

Tsk, lazy Brummies.

cburbridge commented 8 years ago

I'm away this week, but will look at it again on Monday.

creuther commented 8 years ago

Small update: Commanding a velocity from MIRA directly doesn't result in any significant delay, while the delay is definitely there if you command the velocity from ROS. I had a look through the scitos_mira code but can't really see any obvious points of concern.

The only thing that I am not too sure about is the following: Here we define that ScitosG5, i.e. pretty much the entire ROS-MIRA-bridge, should spin at 5Hz. Does the following sleep mean that the entire ROS-MIRA-bridge will be blocked for the remaining (200 - x) ms? I tried to set it to spin at 25Hz as well as 1Hz on Betty but didn't really make a difference. Then again @cburbridge 's debug timestamps show that the odometry is definitely received with 20Hz, so it really doesn't make a difference / block it as far as I can tell. Strange.

cburbridge commented 8 years ago

The 5Hz spin loop is a separate thread to the odometry message receiving, so having this at 5Hz should be ok I think. However, there used to be horrible bug in this respect a while back: https://github.com/strands-project/scitos_drivers/pull/53 so this needs more checking. Since the only component that registers functions to be called in that loop is the head, it would be interesting to test without that module. If you have time could you try running the scitos launch again, but remove Head from https://github.com/strands-project/scitos_drivers/blob/indigo-devel/scitos_mira/launch/scitos_mira.launch#L4. If not no problem, I'll test it out on Monday.

cdondrup commented 8 years ago

Since the tests are almost finished, I have a few questions:

Recovery Behaviours: Which recovery behaviours, if any, do we want during the test runs? Especially things like backtrack make the whole test if DWA can turn when it faces a wall kind of pointless as the robot will backtrack and then turn. Since we want to tune DWA parameters, I think we have to be careful which recoveries to run. This brings me to another question: Why are there two different locations where the recovery behaviours are defined @bfalacerda ? I.e. in the config file and a python file.

Results: What feedback would like to get in addition to that it failed or succeeded. Currently, the service starting the test run will report if navigation to the goal was successful, if not, if the robot was able do die gracefully by at least being able to return to the starting location, the travelled distance and time. Once I took care of the human, I will also add the minimum distance kept to the human.

Timeouts: Currently, the navigation times out after 40 seconds and then tries to go back to the starting location for another 40 seconds. Meaning that if the goal is not reached in that time it will take the same time to try and go back. In total that means each test has a maximum of 80 seconds to at least die gracefully. Do we want a timeout at all or do we want to rely on navigation failing on its own? This might mean that the tests will run for quite a while or we have to reconfigure the sleep and retry argument (if we want that recovery behaviour at all) to make it fail quicker.

Maybe @marc-hanheide and/or @hawesie have an opinion on those.

cdondrup commented 8 years ago

Just briefly talked to @marc-hanheide and this is what we thought would be best:

Recovery Behaviours: None. This is not what we want to test here. Follow-up question for @bfalacerda by not giving the mon nav a config file it does not run the stuff in the config file obviously. For the ones in the python file I would have to disable them via the parameter?! Or can I disable them all together from the start. Maybe not even running monitored navigation for now?

Timeouts: No time outs. Navigation should fail on its own.

Any other thoughts on that?

Jailander commented 8 years ago

The config file tells which recovery behaviours python file to load, you need to run monitored navigation, but if you don't give any config file it will simply not load any (which is what we want in this case).

bfalacerda commented 8 years ago

I agree with @marc-hanheide, and @Jailander already answered the question about the config file :) You should also see a warning being printed telling you that there are no recovery behaviours added.

Regarding the config vs python, config defines the RecoverStateMachine to load. The SM itself is defined as a python class, since it's using smach. We need the python code to define how the SM moves between states.

cdondrup commented 8 years ago

We discovered a different problem regarding the timeout that I guess @bfalacerda solved somewhere in monitored nav that I can't find. Navigation never fails. Somewhere there has to be a timeout. Can you point me to that @bfalacerda ?

bfalacerda commented 8 years ago

I don't get the question. What do you mean navigation never fails?

There are no timeouts in monitored nav, it waits for move_base (or whatever action it calls) to give a result

Jailander commented 8 years ago

I think the question, is how to set a timeout in monitored navigation because move_base doesn't fail by itself

bfalacerda commented 8 years ago

it does, dwa eventually gives up. you can also set the planner_patience and controller_patience params: http://wiki.ros.org/move_base?distro=jade#Parameters

cdondrup commented 8 years ago

The problem is that if the robot starts in front of a wall (10cm distance) it just prints:

[ WARN] [1446723231.414625106]: DWA planner failed to produce path.
[ INFO] [1446723231.513456115]: Got new plan

until the end of times. The problem is that the robot is not inside an obstacle so navigation does not fail but is also not able to find a path. I let it run for a few minutes but it didn't give up. I'll have a look the link. Thanks!

cdondrup commented 8 years ago

OK, I don't think this will help much:

~planner_patience (double, default: 5.0)
How long the planner will wait in seconds in an attempt to find a valid plan before space-clearing operations are performed.
~controller_patience (double, default: 15.0)
How long the controller will wait in seconds without receiving a valid control before space-clearing operations are performed.

It does perform the clearing but then simply keeps on trying to no avail.

bfalacerda commented 8 years ago

are you sure it's not failing and being called over and over again? Are you using topo nav policy execution? I think it tries several times. monitored nav will always report back to topo after a failure so it can decide what to do

bfalacerda commented 8 years ago

the behaviour that move_base usually has after dwa failing for controller_patience seconds is clearing the costmaps and trying again. If valid controls are still not found, it should output aborted. I've never seen it getting stuck with dwa failing forever

cdondrup commented 8 years ago

Just to see if it does anything I ran it for 5 minutes (8 minutes now after finishing to write this) and the only thing it outputs is that DWA is unable to produce a path. This is the same when the robot, e.g. is surrounded by people, technically, there is nothing wrong because the robot is in free space and not inside an obstacle but the DWA can still not find a path. Hence, navigation will never fail and mon nav will never know.

It did clear the costmaps but then keeps on running till forever. I'm using the default move_base params:

cdondrup@calculon:~/ros_ws
$ rosparam get /move_base/
DWAPlannerROS: {acc_lim_th: 2.0, acc_lim_theta: 3.2, acc_lim_x: 1.0, acc_lim_y: 0.0,
  acc_limit_trans: 0.1, angular_sim_granularity: 0.1, forward_point_distance: 0.325,
  goal_distance_bias: 24.0, holonomic_robot: false, latch_xy_goal_tolerance: true,
  max_rot_vel: 1.0, max_scaling_factor: 0.2, max_trans_vel: 0.55, max_vel_x: 0.55,
  max_vel_y: 0.0, min_rot_vel: 0.4, min_trans_vel: 0.1, min_vel_x: 0.0, min_vel_y: 0.0,
  occdist_scale: 0.01, oscillation_reset_angle: 0.2, oscillation_reset_dist: 0.05,
  path_distance_bias: 32.0, prune_plan: true, restore_defaults: false, rot_stopped_vel: 0.1,
  scaling_speed: 0.25, sim_granularity: 0.025, sim_time: 2.1, stop_time_buffer: 0.2,
  trans_stopped_vel: 0.1, use_dwa: true, vth_samples: 20, vx_samples: 3, vy_samples: 10,
  xy_goal_tolerance: 0.3, yaw_goal_tolerance: 0.1}
NavfnROS: {allow_unknown: false}
aggressive_reset:
  layer_names: [obstacle_layer]
  reset_distance: 0.0
base_global_planner: navfn/NavfnROS
base_local_planner: dwa_local_planner/DWAPlannerROS
clearing_rotation_allowed: true
conservative_reset:
  layer_names: [obstacle_layer]
  reset_distance: 2.0
conservative_reset_dist: 3.0
controller_frequency: 10.0
controller_patience: 15.0
global_costmap:
  footprint: '[[-0.3,-0.32],[-0.52,-0.11],[-0.52,0.11],[-0.3,0.32],[-0.05,0.31],[0.05,0.25],[0.15,0.13],[0.2,0],[0.15,-0.13],[0.05,-0.25],[-0.05,-0.31]]'
  footprint_padding: 0.0001
  global_frame: /map
  height: 10
  inflation_layer: {cost_scaling_factor: 5.0, enabled: true, inflation_radius: 0.8}
  map_layer: {enabled: true, map_topic: /map}
  no_go_layer: {enabled: true, map_topic: /no_go_map}
  obstacle_layer:
    combination_method: 1
    enabled: true
    laser: {clearing: true, data_type: LaserScan, marking: true, sensor_frame: base_laser_link,
      topic: scan}
    mark_threshold: 0
    max_obstacle_height: 2.0
    observation_sources: laser
    obstacle_range: 2.5
    origin_z: 0.0
    raytrace_range: 3.0
    track_unknown_space: true
    unknown_threshold: 6
    z_resolution: 0.2
    z_voxels: 6
  obstacle_layer_footprint: {enabled: true}
  origin_x: 0.0
  origin_y: 0.0
  plugins:
  - {name: map_layer, type: 'costmap_2d::StaticLayer'}
  - {name: no_go_layer, type: 'costmap_2d::StaticLayer'}
  - {name: obstacle_layer, type: 'costmap_2d::VoxelLayer'}
  - {name: inflation_layer, type: 'costmap_2d::InflationLayer'}
  publish_frequency: 0.5
  resolution: 0.05
  robot_base_frame: /base_link
  robot_radius: 0.46
  rolling_window: false
  static_map: true
  transform_tolerance: 0.3
  update_frequency: 5.0
  width: 10
local_costmap:
  footprint: '[[-0.3,-0.32],[-0.52,-0.11],[-0.52,0.11],[-0.3,0.32],[-0.05,0.31],[0.05,0.25],[0.15,0.13],[0.2,0],[0.15,-0.13],[0.05,-0.25],[-0.05,-0.31]]'
  footprint_padding: 0.0001
  global_frame: /map
  height: 4
  inflation_layer: {cost_scaling_factor: 5.0, inflation_radius: 0.8}
  map_layer: {map_topic: /map}
  no_go_layer: {map_topic: /no_go_map}
  obstacle_layer:
    combination_method: 1
    enabled: true
    laser: {clearing: true, data_type: LaserScan, marking: true, sensor_frame: base_laser_link,
      topic: scan}
    mark_threshold: 0
    max_obstacle_height: 2.0
    observation_sources: laser
    obstacle_range: 2.5
    origin_z: 0.0
    raytrace_range: 3.0
    track_unknown_space: true
    unknown_threshold: 6
    z_resolution: 0.2
    z_voxels: 6
  obstacle_layer_footprint: {enabled: true}
  origin_x: 0.0
  origin_y: 0.0
  plugins:
  - {name: obstacle_layer, type: 'costmap_2d::VoxelLayer'}
  publish_frequency: 2.0
  resolution: 0.05
  robot_base_frame: /base_link
  robot_radius: 0.46
  rolling_window: true
  static_map: false
  transform_tolerance: 0.3
  update_frequency: 5.0
  width: 4
oscillation_distance: 0.5
oscillation_timeout: 0.0
planner_frequency: 0.5
planner_patience: 5.0
recovery_behavior_enabled: true
recovery_behaviors:
- {name: conservative_reset, type: clear_costmap_recovery/ClearCostmapRecovery}
- {name: aggressive_reset, type: clear_costmap_recovery/ClearCostmapRecovery}
restore_defaults: false
shutdown_costmaps: false

Yes, I am using policy execution from topo nav.

The easiest fix would be to set a timeout for the policy execution client but this feels somehow artificial.

cdondrup commented 8 years ago

Also, the goal is in free space. If the goal would be blocked, this would fail.

bfalacerda commented 8 years ago

I just tried calling move base directly via rviz and i get this:

[ WARN] [1446725936.535797813]: DWA planner failed to produce path.
[ INFO] [1446725936.633590127]: Got new plan
[ WARN] [1446725936.636505293]: DWA planner failed to produce path.
[ INFO] [1446725936.733484535]: Got new plan
[ WARN] [1446725936.735877074]: DWA planner failed to produce path.
[ WARN] [1446725936.833553834]: Clearing costmap to unstuck robot (2.000000m).
[ INFO] [1446725937.033437890]: Got new plan
[ WARN] [1446725937.035099355]: DWA planner failed to produce path.
[ WARN] [1446725937.133271232]: Clearing costmap to unstuck robot (0.000000m).
[ INFO] [1446725937.333269376]: Got new plan
[ WARN] [1446725937.336529415]: DWA planner failed to produce path.
[ERROR] [1446725937.433302700]: Aborting because a valid control could not be found. Even after executing all recovery behaviors

And then it stops. Can you try an rviz goal also? Do you see this error message in your case? If so, it aborted, and then was called again

cdondrup commented 8 years ago

OK, I tested a bit more. This seems to only happen in one of the test cases where the robot after clearing the costmap is bumping back a and forth by ~1cm all the time. Therefore it moves and DWA does not fail. In every other case, I also observe the behaviour you described. It's test map mb_test1 if you want to try.