Making movebase work - Githubissues

marc-hanheide commented 9 years ago

team members @cburbridge @bfalacerda @cdondrup @Pandoro @Jailander @denisehe @TobKoer @PDuckworth

Identified issues

- global planner is updating too slowly (tried 10Hz)
- the local planner's parameterisation is off and uses many "guessed" values
- we need a test definition (to have some "genetic" optimisation for the parameters)
- observed problems
  1. turning towards obstacles
  2. stopping too late (maybe the accelerations are just wrong)
  3. zig-zag paths
  4. making physical contact
  5. the robot always tries to go to the exact waypoint (should already be relaxed with the newer version)
  6. local cost map probably too small
  7. global cost map inflation could be wrong: We don't want the robot to be always in the middle of a corridor, so we don't want any gradual inflation.
  8. if the robot cannot reach a waypoint (closed door), it stays in front of the door blocking it (might be solved better on topological / execution level)
  9. robot is quite "pushy" navigating through crowds (this should already be working with human-aware nav)
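
To make the "gradual inflation" point in item 7 concrete: as far as I recall, the inflation layer of the ROS `costmap_2d` package lets the cost decay exponentially with distance beyond the inscribed radius, and the steepness is set by `cost_scaling_factor`. The sketch below reproduces that decay in plain Python. It ignores the `inflation_radius` cutoff, and the numbers in the usage note are illustrative, not the STRANDS parameters.

```python
import math

# Cost constants as used by costmap_2d (assumed from memory of that package).
LETHAL_OBSTACLE = 254
INSCRIBED_INFLATED_OBSTACLE = 253


def inflation_cost(distance_m, inscribed_radius_m, cost_scaling_factor):
    """Cost of a cell that is distance_m metres from the nearest obstacle cell."""
    if distance_m <= 0.0:
        return LETHAL_OBSTACLE
    if distance_m <= inscribed_radius_m:
        # The robot footprint would certainly touch the obstacle here.
        return INSCRIBED_INFLATED_OBSTACLE
    # Exponential decay outside the inscribed radius.
    factor = math.exp(-cost_scaling_factor * (distance_m - inscribed_radius_m))
    return int((INSCRIBED_INFLATED_OBSTACLE - 1) * factor)
```

With an inscribed radius of 0.3 m, a cell 0.5 m from a wall still carries a high cost for a small scaling factor (e.g. 1.0), which pushes plans towards the corridor centre, while a large factor (e.g. 20.0) makes the cost drop to almost nothing a few centimetres past the footprint.
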
Test cases

- general setup: L-shape, with a topo map with 3 waypoints (one intermediate)
- running via topo nav
- collect rosbags
- the obstacles are always "around the corner", 1m away from intersection

static tests

- place robot close to obstacle (10cm away, at 45, 90, 135 degrees), goals right behind, and at 45 degree angle
- traverse narrow corridor (1m, corridor in global map)
- traverse corridor with row of chairs (2m wide, leaving a 1m gap)
  - a. all chairs on one side
  - b. chairs on both sides
- robot trapped
- (optional: traverse a door)
- blocking the exact intermediate waypoint (putting a chair there)
- blocking the target waypoint (putting a wheelchair there), stopping at appropriate distance
- chairs blocking the whole route
- human standing next to wall in 2m corridor, with crutches sticking out, creating a gap of about 1m

dynamic tests

- emergency stop: "jump" in front of robot (50 cm distance)
- pass-crossing: (person walks / wheelchair drives) in front of the robot coming from the side: robot should stop
- pass-by: (person walks / wheelchair drives) in front of the robot coming from the side: robot should stop
- pass-crossing and pass-by with a group of people (group blocks the way for the robot), with a 10 second break of the group standing in front of robot

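
The first static test is really a small matrix of start poses and goal placements. A minimal sketch of how it could be enumerated is below. It assumes that 45/90/135 are the angles at which the robot faces the obstacle and that the two goal placements are "right behind" and "at 45 degrees"; the function name, dictionary keys and test names are made up for illustration and are not the names used in the actual test suite.

```python
from itertools import product

# Values taken from the static test description above.
DISTANCE_CM = 10
FACING_ANGLES_DEG = (45, 90, 135)
# Hypothetical labels for the two goal placements.
GOAL_PLACEMENTS = ("behind", "45_degrees")


def static_obstacle_cases():
    """Return one dict per (facing angle, goal placement) combination."""
    cases = []
    for angle, goal in product(FACING_ANGLES_DEG, GOAL_PLACEMENTS):
        cases.append({
            "name": "static_obstacle_%dcm_%ddeg_goal_%s" % (DISTANCE_CM, angle, goal),
            "distance_cm": DISTANCE_CM,
            "facing_angle_deg": angle,
            "goal": goal,
        })
    return cases
```

Enumerating the cases this way gives six named runs, which makes it easier to collect one rosbag per case and compare parameter sets across sites.
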
Things to do
- [x] collect all nav params used across sites and comment your observation / characteristic (please comment on this issue)
- [ ] look at cost maps in the mon nav from deployment data and identify the situations where it failed
- [ ] implement above test cases and iterate parameters on all sites

To prevent regressions:

- [ ] implement navigation unit tests, based on ROS bag
- [x] implement navigation unit tests, based on Morse
Further ideas:
- make it clearer when the robot is about to move
- put warning lights on to indicate when it is about to move (complementing speech)
- check for RAM/CPU usage
- watch dogs need to be always running

cdondrup commented 9 years ago

Another problem, I start mon nav without a config file and get the following:

[WARN] [WallTime: 1446726452.096017] No config yaml file provided. MonitoredNavigation state machine will be initialized without recovery behaviours. To provide a config file do 'rosrun monitored_navigation monitored_nav.py path_to_config_file'. The strands config yaml file is located in monitored_navigation/config/strands.yaml

which I thought means that it would also not do the sleep and retry. Still it keeps on trying after it says aborted:

[ERROR] [1446726337.024717847]: Aborting because a valid control could not be found. Even after executing all recovery behaviors
[INFO] [WallTime: 1446726337.051787] State machine terminating 'NAVIGATION':'planner_failure':'not_recovered_without_help'
[INFO] [WallTime: 1446726337.052387] Concurrent state 'NAV_SM' returned outcome 'not_recovered_without_help' on termination.
[INFO] [WallTime: 1446726337.053695] Concurrent Outcomes: {'NAV_SM': 'not_recovered_without_help'}
[INFO] [WallTime: 1446726337.054235] State machine terminating 'MONITORED_NAV':'not_recovered_without_help':'not_recovered_without_help'
[INFO] [WallTime: 1446726337.054745] ABORTED
[INFO] [WallTime: 1446726337.303836] Intermediate -(move_base)-> End
[INFO] [WallTime: 1446726337.356830] State machine starting in initial state 'MONITORED_NAV' with userdata: 
    ['goal', 'result']
[INFO] [WallTime: 1446726337.357498] Concurrence starting with userdata: 
    ['goal']
[INFO] [WallTime: 1446726337.359362] State machine starting in initial state 'NAVIGATION' with userdata: 
    ['goal', 'n_fails']

So I get the above several times before my timeout of 1 minute kicks in and preempts it.

I am calling it like this:

self.client.send_goal(ExecutePolicyModeGoal(route=self._policy.route))
self.client.wait_for_result(timeout=rospy.Duration(self._timeout))

but this only returns after the timeout is reached and not when it fails for the first time. Is it supposed to retry navigation even if sleep and retry should not be there due to no config file?

bfalacerda commented 9 years ago

i see... maybe that has to do with the very aggressive clearing we have now: https://github.com/strands-project/strands_movebase/blob/indigo-devel/strands_movebase/strands_movebase_params/move_base_params.yaml#L15

This means we're clearing everything the robot cannot see from the costmap, which allows him to move a bit. maybe we should increase this value a bit?

bfalacerda commented 9 years ago

regarding the retrying, that's the policy executor which is configured to retry. You see that mon nav is outputting ABORTED there. there should be a param that you can change, @Jailander ?

cdondrup commented 9 years ago

Ah... redundancy... good...

I'll try the parameter. Thanks!

Jailander commented 9 years ago

hmm, there should be one, I can't remember. If not, I'll make it ;)

bfalacerda commented 9 years ago

the sleep and retry is there to avoid asking for help too soon. the topo nav retry is there to make sure that after failing (and being pushed around during help, for example) the robot checks its current waypoint and executes the appropriate action
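
The two retry layers described above can be sketched in plain Python. This is only an illustration of the idea, not the monitored_navigation or policy executor code: `move_to` and `closest_waypoint` are hypothetical callables standing in for the move_base call and the topological localisation, and the parameter names are invented.

```python
import time


def monitored_nav(goal, move_to, sleep_retries=1, sleep_s=0.0):
    """Inner layer: sleep and retry before giving up (i.e. before asking for help)."""
    for attempt in range(sleep_retries + 1):
        if move_to(goal):
            return True
        if attempt < sleep_retries:
            time.sleep(sleep_s)
    return False


def execute_route(route, move_to, closest_waypoint, max_retries=1, **nav_kwargs):
    """Outer layer: after a failure, re-check where the robot is and resume."""
    index, retries = 0, 0
    while index < len(route):
        if monitored_nav(route[index], move_to, **nav_kwargs):
            index += 1
            continue
        if retries >= max_retries:
            return False
        retries += 1
        # The robot may have been pushed around, so continue from the
        # waypoint it is actually closest to (if that one is on the route).
        current = closest_waypoint()
        if current in route:
            index = route.index(current) + 1
    return True
```

With `max_retries=0` the outer layer returns on the first failure, which is the behaviour expected when calling the policy executor with a timeout; with `sleep_retries=0` the inner layer gives up immediately, as when no recovery config is loaded.
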

Jailander commented 9 years ago

oh yeah I'll set this up with @cdondrup

Jailander commented 9 years ago

that parameter existed for topological navigation but not for execute_policy_mode. A fix for this is in https://github.com/strands-project/strands_navigation/pull/275

cburbridge commented 9 years ago

@creuther The issue with the delay on the odometry channel or commanding the robot is not a problem with the ROS side, but the MIRA side. I have created a separate issue for it here: https://github.com/strands-project/scitos_drivers/issues/114

cdondrup commented 8 years ago

The navigation tests have been merged into strands_navigation. This has not been released yet, but have a look at the README and check out the current version of the repo to run tests in simulation and on the real robot.

nilsbore commented 8 years ago

Cool! Is it time for another hangout or should we just start running these tests IRL and then report back our findings?

cdondrup commented 8 years ago

You can try the tests first and let me know if you have any questions/suggestions. Still missing are the dynamic tests and also the exact definition of what kind of feedback we want from the test.

In general, I think a hangout would be good.

marc-hanheide commented 8 years ago

Agreed on the hangout. Can you initiate a doodle? Also, I have released strands_navigation again to include the tests.

denisehe commented 8 years ago

hello, well I followed your discussions - but actually all the technical details do not tell me that much. is it possible that you film the tests so that I can see how the robot reacts to different test situations - so I can give you profound feedback if it is okay for the deployment or not :-) Cheers from sunny and extremely hot vienna!

cdondrup commented 8 years ago

I sent an email to everyone who participated in the last hangout but in case someone new wants to join: http://doodle.com/poll/dchmfrzzwsca2qnk

Please fill in the doodle till the end of the week.

cburbridge commented 8 years ago

I have found what I suspect to be a bug in the scitos URDF: https://github.com/strands-project/scitos_common/pull/59. If this is an issue then it will have an effect on the movebase parameters....

cdondrup commented 8 years ago

Regarding the forking of ros navigation into our github mentioned in here, is the current navigation fork already @cburbridge 's version? Is this released already? I am currently changing the DWA to include my velocity costmaps so I would like to be able to contribute to our fork of the navigation framework. Can someone also pull the latest changes from upstream and enable issues on our fork please?

marc-hanheide commented 8 years ago

I enabled issues

marc-hanheide commented 8 years ago

I have also done the following on https://github.com/strands-project/navigation:

merged the upstream of indigo-devel into our fork
I have made indigo-devel the default branch
I have enabled devel and PR build on jenkins for our fork (as for all our other repos)
I have named our fork not "navigation", but "navigation_strands_fork" to avoid conflicts with the official ones (This is only an internal name, we are not releasing Ubuntu packages from that fork yet, but once we do, we need to think of a name for it, or, better, a version prefix)

cdondrup commented 8 years ago

Thank you @marc-hanheide !

marc-hanheide commented 8 years ago

also changed all maintainers to marc@hanheide.net in all package.xml to make sure we don't harass the original maintainers when our builds fail. Feel free to change it to your name, when you do change things.

marc-hanheide commented 8 years ago

agreed in meeting today:

@bfalacerda to make sure we only use strands_movebase parameters (https://github.com/strands-project/strands_morse/issues/144)
put the correct params as discussed above in strands_movebase
update the footprint
make sure that all those PRs related to this are merged
run the tests as described in https://github.com/strands-project/strands_navigation/blob/indigo-devel/topological_navigation/tests/README.md from a workspace that contains

nilsbore commented 8 years ago

Sorry I wasn't there, had forgotten to put it in my calendar. Will follow up on this.

nilsbore commented 8 years ago

It seems that the moving of the laser and chest camera frames, together with the changing of the footprint, has improved navigation quite a bit in some aspects. What I've seen here is that the robot is traversing doors noticeably better than before.

Overall, when I ran navigation for a few hours the other day it seemed more robust but that is harder to pin down to something specific. Time to run the navigation tests =P.

cdondrup commented 8 years ago

Currently failing tests in simulation due to parameters:

test_static_1m_l_shaped_corridor: After taking the bend the robot comes too close to one of the walls and shows the well known behaviour of just turning towards it and dying.
test_static_70cm_wide_door: Still unsure if this will ever be solvable.
test_static_corridor_blocked_by_humans: Robot plans through groups of humans (even though there is not enough space) and traps itself between them.
test_static_corridor_blocked_by_wheelchairs: Robot plans through one of the wheelchairs because it only sees the wheels. This might be fixable using 3D obstacle avoidance. Nevertheless, there is not enough space for the robot to go between the wheels either.
test_static_facing_wall_10cm_distance_0_degrees_goal_behind: Too close to wall for move base to get free.
test_static_facing_wall_10cm_distance_minus_45_degrees_goal_behind: Too close to wall for move base to get free. Turns to +45 and then dies.
test_static_facing_wall_10cm_distance_plus_45_degrees_goal_behind: Too close to wall for move base to get free. Turns to -45 and then dies.
test_static_human_on_end_point: Not able to go back to start after failing to reach the end point.
test_static_wheelchair_on_end_point: Not able to go back to start after failing to reach the end point.

nilsbore commented 8 years ago

Maybe it's time to switch strands_morse to strands_movebase and possibly add the chest camera to the simulation. The nice thing about the chest camera in simulation is that we can use the point cloud as-is, it won't be as expensive as the computations we do for the head cam to create the color point cloud and the images. What do you think?

bfalacerda commented 8 years ago

if navigation is being launched using strands_navigation.launch, then morse is also using strands_movebase. see https://github.com/strands-project/strands_morse/issues/144

bfalacerda commented 8 years ago

regarding chest cam, we could add it yes, i guess it'll be important for the wheelchairs

Jailander commented 8 years ago

yes that would be brilliant, however I think jenkins can only run simulations in the fast mode without cameras (right @marc-hanheide ?) but we could still run the tests independently using a flag like for the real robot

nilsbore commented 8 years ago

Ok, I was just looking at this launch file: https://github.com/strands-project/strands_morse/blob/indigo-devel/mba/launch/move_base_arena_nav.launch .

nilsbore commented 8 years ago

I'll have a look at adding the chest cam sometime during this week.

marc-hanheide commented 8 years ago

Can we please leave the default without chestcam in simulation. Many students and other people (including jenkins) don't have a suitable GPU and can't run the full Morse. Of course it would be great to add chestcam and a suitable flag.

nilsbore commented 8 years ago

Of course, it will be an option.

cdondrup commented 8 years ago

Once we have a navigation set-up that runs in simulation and on the robot, I can add that to the tests and have a flag as well, yes.

cdondrup commented 8 years ago

We did the first real robot tests based on the test scenarios in simulation. You find a few videos here: https://lcas.lincoln.ac.uk/owncloud/index.php/s/GppNjlMSMyaHTVf#

Short summary of what isn't working:

the robot starting 10cm away from the wall. Doesn't even move regardless of the angle it is facing it at.
the middle waypoint blocked by an obstacle, robot doesn't even try to move. According to @Jailander that should have been fixed and the robot should try to go as close as possible to that point at least but we didn't observe that.
the last waypoint is blocked. The robot doesn't even try to move towards the last waypoint after reaching the intermediate one. Not sure if this should be changed or is actually a good behaviour.

marc-hanheide commented 8 years ago

Many thanks, @Jailander and @cdondrup for getting going on this. I think it's now time we reconvene on hangout to replicate this at the various sites and to discuss the improvements possible to make it pass all those tests. The latter seem to be mostly related to topological navigation, so @Jailander and @bfalacerda are probably the ones to look at those.

Question to @cdondrup: I presume this was based on the latest set of parameters etc. Just for the record, can you comment and link the exact commits/released versions you have been using for your tests? Also, given your previous experience, would you say this was already better than what we had in the past, as we had some improvements (accelerations, laser position, etc) implemented already?

In any case, I put a doodle for the next meeting out: https://doodle.com/poll/mtynvcxw23fnsmdp

Please pick your slots quickly, so we can have the meeting at short notice.

bfalacerda commented 8 years ago

i'd expect the robot to at least try to get closer to the wp in the second situation, even if he doesnt manage to get through it... I'll also try that and see what happens. Is float64 xy_goal_tolerance of the waypoints set to something?

Regarding the last one, that's the expected behaviour and I also think it's acceptable. we cant really hope to do anything better than that.

I have a suggestion regarding this intermediate waypoints issue, and also all the weird trajectories the robot takes when following a policy between nodes: the current version changes goal once we get into the influence area of a node, and the next action is also move_base/human_aware. I suggest a two step lookahead instead, i.e., if we have a waypoint sequence A->B->C->D, all with move_base, once we get to the influence area of B, we send a move_base goal to D. This will surely make the robot's movement much smoother (e.g., it'll probably avoid the weird nav in the AAF corridors), and will also make it more robust to occupied intermediate nodes. The only drawback I see is when the robot is doing B->C->D, and ends up not getting into C's influence area, for example because of an obstacle. We'd have to be careful with that during execution, and with the nav stats.

@Jailander does this sound sensible?

Another option would be to change the move_base goal immediately after the current target becomes the closest_node. That would have the same advantages and issues more or less.
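
A minimal sketch of the two step lookahead, to make the A->B->C->D example concrete. It is not the topological navigation code: the function name is invented, `edge_actions[i]` is assumed to hold the action of the edge from `route[i]` to `route[i + 1]`, and the action labels are placeholders.

```python
# Edge actions that can be merged into one move_base-style goal
# (placeholder labels for this sketch).
MERGEABLE = frozenset(["move_base", "human_aware"])


def lookahead_goal(route, edge_actions, reached_index, lookahead=2):
    """Waypoint to send as goal once the robot enters the influence area
    of route[reached_index]."""
    last = len(route) - 1
    if reached_index >= last:
        return route[last]
    target = reached_index + 1
    # Only look further ahead across consecutive mergeable edges.
    while (target < last
           and target - reached_index < lookahead
           and edge_actions[target - 1] in MERGEABLE
           and edge_actions[target] in MERGEABLE):
        target += 1
    return route[target]
```

For the route A, B, C, D with only move_base edges, entering B's influence area yields D as the next goal, while `lookahead=1` gives the current behaviour (C). The case where the robot never enters C's influence area still has to be handled by whatever tracks progress and nav stats.
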

Jailander commented 8 years ago

@bfalacerda that could be possible. I'm unsure of the consequences of this in terms of making the robot follow the actual topological route and not taking another route. I'll put it in as soon as I have time on a separate branch and test it in simulation.

About the XY tolerance, you are right, this should have worked. I'll rerun the tests today

EDIT: we already change the move_base goal immediately after the current target becomes the closest_node (forgot to write this)

cburbridge commented 8 years ago

This needs fixing in the simulation as it might have an effect on the testing results https://github.com/strands-project/strands_morse/issues/143

bfalacerda commented 8 years ago

@Jailander when was that changed? is nav smoother now?

Jailander commented 8 years ago

oops, no, my mistake. I just checked: I did that in the ICRA branch but never pushed it upstream, as that branch is substantially different from our main system. The behaviour it had was worse; the reason is that when the robot changed goal sooner it was going faster, so it stopped harder (but in that case there was a bigger time gap between move base goals because we switched maps too). I am not sure if we really should try it in the main system too, do you think it's necessary?

bfalacerda commented 8 years ago

I think he shouldn't stop at all, if you send a new goal to move_base it usually changes quite smoothly. We have deceleration now because he was getting close to the goal before receiving the new one, but if we send new goals from further away it should be smoother. Why does he stop when he gets a new goal?

Jailander commented 8 years ago

hmm, in that case it could have been because there was no (metric) map for some time. I'll test it under normal conditions and tell you.

marc-hanheide commented 8 years ago

Minutes of Meeting:

We only use the parameters in strands_movebase
We re-released strands_movebase, strands_navigation, scitos_common to be up to date
Then everybody should replicate the tests according to the video

marc-hanheide commented 8 years ago

All releases are out. Time to update and run the tests.

nilsbore commented 8 years ago

Unfortunately, it looks like Rosie's PCB power board has fried again, meaning we have no main computer atm. We'll see what happens but it likely won't be possible to run tests this week.

marc-hanheide commented 8 years ago

Ah, bugger... OK, we'll have to see what we can do here and hopefully Birmingham @kunzel @bfalacerda will be able to run on their site?

bfalacerda commented 8 years ago

yes, we have a student that's going to help, I'm meeting him today to set up the testing. Just to confirm, the idea now is to run the static tests one by one and report the result, right?

bfalacerda commented 8 years ago

So we have the robot ready to run tests, we just need to make some arenas resembling the scenarios.

~~@Jailander @cdondrup the output from the "push robot to start location" always tells me I'm 2 meters away from the goal.~~ Ignore this, the topo node pose wasn't being updated properly for some reason

strands-project / strands_movebase

Making movebase work #62
