cartographer-project / cartographer_ros

Provides ROS integration for Cartographer.
Apache License 2.0

Split local and global mapping on two machines #819

Closed ecidon closed 6 years ago

ecidon commented 6 years ago

Hello,

We are a research group from Stanford, and we are looking at Cartographer as the basis for our research. Is there a standard way of splitting the code across two machines? Specifically, we would like the local optimization from the paper to run on one device and the global optimization to run on a server.

Thanks,

Eyal

gaschler commented 6 years ago

Yes, this is actually an (as yet undocumented) feature. It requires gRPC. When you build cartographer and cartographer_ros, set the CMake option -DBUILD_GRPC=True. The easiest way is then to set up a ROS launch config that runs multiple cartographer_grpc_server instances, with cartographer_grpc_node instances as bridges from rosbag and to rviz. Take a look at the example in https://github.com/gaschler/cartographer_fetch/blob/video-1/cartographer_fetch/launch/video_cloud_1.launch and the Lua configs referenced in there.

ecidon commented 6 years ago

Thanks, will check it out.

ecidon commented 6 years ago

Hi,

We tried running the launch file in the link. We downloaded the bags from: https://google-cartographer-ros-for-fetch-robotics-platforms.readthedocs.io/en/latest/

We get the following error when running the example: (two screenshots from 2018-04-11, not preserved)

cschuet commented 6 years ago

@ecidon Apologies for the inconvenience. https://github.com/googlecartographer/cartographer/pull/1057 should fix that. Cloud-based mapping is something we are still actively working on. Please note also that at the moment we are using a gRPC streaming connection between local and global SLAM to propagate sensor data and local SLAM updates, which does not support ACKs and resending, hence this is not going to work over spotty WiFi. In the next couple of days we will change that to a batch uploading strategy which supports buffering and retries.

Can you share some information about your use case? We are super interested in potential users and their needs.
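The batch uploading strategy mentioned above (buffering plus retries, as opposed to a fire-and-forget stream) can be illustrated with a minimal sketch. This is not Cartographer's implementation; `BatchUploader`, `send_batch`, and all parameters are hypothetical names chosen for illustration, and `send_batch` is assumed to return `True` only once the receiving side has acknowledged the batch.

```python
import time
from collections import deque
from typing import Callable, Deque, List


class BatchUploader:
    """Buffers items locally and uploads them in batches, retrying on failure.

    Hypothetical illustration only; not Cartographer's actual uploader.
    """

    def __init__(self, send_batch: Callable[[List[bytes]], bool],
                 batch_size: int = 3, max_retries: int = 5,
                 backoff_s: float = 0.0) -> None:
        self._send_batch = send_batch  # assumed to return True on acknowledgement
        self._batch_size = batch_size
        self._max_retries = max_retries
        self._backoff_s = backoff_s
        self._buffer: Deque[bytes] = deque()

    def add(self, item: bytes) -> None:
        """Queue one item; nothing is dropped while the link is down."""
        self._buffer.append(item)

    def pending(self) -> int:
        """Number of items that have not been acknowledged yet."""
        return len(self._buffer)

    def flush(self) -> int:
        """Upload buffered items batch by batch.

        Returns the number of items acknowledged. Items of a failed batch
        stay in the buffer so that a later flush() can resend them.
        """
        acked = 0
        while self._buffer:
            batch = list(self._buffer)[: self._batch_size]
            if not self._try_send(batch):
                break  # give up for now, keep the data buffered
            for _ in batch:
                self._buffer.popleft()
            acked += len(batch)
        return acked

    def _try_send(self, batch: List[bytes]) -> bool:
        """Try one batch up to max_retries times with exponential backoff."""
        for attempt in range(self._max_retries):
            if self._send_batch(batch):
                return True
            time.sleep(self._backoff_s * (2 ** attempt))
        return False
```

The point of the design is that data only leaves the buffer after an acknowledgement, so a dropped connection delays the upload instead of losing sensor data or local SLAM results.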

ecidon commented 6 years ago

Hi @cschuet,

We reran the video_cloud_1 launch file and we now get the following error: [FATAL] [1523651726.193253447]: F0413 13:35:26.000000 16729 pose_graph_stub.cc:54] Check failed: client.Write(request)


We then tried instead to modify cartographer_ros/cartographer_ros/launch/grpc_demo_backpack_2d.launch based on video_cloud_1.

Our modified launch file:

<launch>
  <param name="/use_sim_time" value="true" />

  <param name="robot_description"
    textfile="$(find cartographer_ros)/urdf/backpack_2d.urdf" />

  <node name="robot_state_publisher" pkg="robot_state_publisher"
    type="robot_state_publisher" />

  <node name="cartographer_grpc_server" pkg="cartographer_ros"
      type="cartographer_grpc_server.sh" args="
          -configuration_directory $(find cartographer_ros)/configuration_files
          -configuration_basename backpack_2d_server.lua">
  </node>

  <node name="cartographer_grpc_node" pkg="cartographer_ros"
      type="cartographer_grpc_node" args="
          -configuration_directory $(find cartographer_ros)/configuration_files
          -configuration_basename backpack_2d.lua
          -server_address localhost:50051"
      output="screen">
    <remap from="echoes" to="horizontal_laser_2d" />
  </node>

  <node name="cartographer_grpc_bridge_cloud" pkg="cartographer_ros"
      type="cartographer_grpc_node" args="
          -configuration_directory $(find cartographer_ros)/configuration_files
          -configuration_basename backpack_2d.lua
          -server_address localhost:50100"
      output="screen">
    <remap from="echoes" to="horizontal_laser_2d" />
  </node>

  <node name="cartographer_grpc_cloud" pkg="cartographer_ros"
      type="cartographer_grpc_server.sh" args="
          -configuration_directory $(find cartographer_fetch)/configuration_files
          -configuration_basename cloud_server.lua">
  </node>

  <node name="playbag" pkg="rosbag" type="play"
        args="--clock $(arg bag_filename)" />
  <node name="rviz" pkg="rviz" type="rviz" required="true"
      args="-d $(find cartographer_ros)/configuration_files/demo_2d.rviz" />
</launch>

We get the same error: [FATAL] [1523652188.148686489, 1461760302.894621795]: F0413 13:43:08.000000 17506 pose_graph_stub.cc:103] Check failed: client.Write(request)

We would be happy to chat via email if you would like.

Thanks!

kdaun commented 6 years ago

We located a bug (https://github.com/googlecartographer/cartographer/issues/1071) that seems related to your issue, and we are working on fixing it as soon as possible.

kdaun commented 6 years ago

The fix (https://github.com/googlecartographer/cartographer/pull/1073) is merged. Apologies for the inconvenience.

ecidon commented 6 years ago

I am still getting this error in the video_cloud_1 launch. Different line in the code though: [FATAL] [1523921476.266674477]: F0416 16:31:16.000000 13578 pose_graph_stub.cc:74] Check failed: client.Write(request)

I played around with my launch file from above, and I managed to reproduce this problem when I split the cloud and local nodes into two groups using two different ROS masters. By the way, why do you do this in video_cloud_1?

When I use a single ROS master for all the nodes I get a different error: [ERROR] [1523921633.366962472, 1461760303.644203068]: E0416 16:33:53.000000 14126 map_builder_bridge.cc:175] Requested submap from trajectory 1 but there are only 1 trajectories.

gaschler commented 6 years ago

You may need to merge https://github.com/gaschler/cartographer_fetch/blob/video-1/cartographer_fetch/launch/video_cloud_1.launch first. I wrote it in January, and config parameters may have been renamed or added since then because cloud/ is still in development. The purpose of this example is to simulate two agents doing local SLAM and one global SLAM server on one machine, and to record a demo video that shows three rviz windows. That is why it uses groups.

cschuet commented 6 years ago

Hi @ecidon, sorry for the inconvenience, but this is still very much WIP. I have put up a branch for demonstration here. The following diagram hopefully explains a little better what this does:

(diagram not preserved)

First start the cloud side, i.e.

ROS_MASTER_URI="http://localhost:11111" roslaunch cartographer_ros cloud.launch

This should open up an rviz instance as well. Then you can start the mapping robot using

roslaunch cartographer_ros robot.launch bag_filename:=${HOME}/Downloads/cartographer_paper_deutsches_museum.bag

A second rviz opens that shows the view inside the robot. You can independently change settings on robot and cloud by modifying robot.lua and cloud.lua. For example, global SLAM on the robot can be disabled by overriding optimize_every_n_nodes = 0.

ojura commented 6 years ago

Wait, why would you have two grpc servers? What's the difference between this and having a single server which would talk to grpc nodes on robots?

cschuet commented 6 years ago

@ojura Bandwidth and latency. Running cartographer_grpc_server on the robot for local SLAM or in pure localization mode (if you can afford it) and only uploading local SLAM results to the cloud requires much less bandwidth. In addition, you don't incur the network latency when you want to use Cartographer localization locally to feed a nav stack.

ojura commented 6 years ago

I thought that local slam runs inside the grpc node and uploads the results to the server which does global slamming. It seems that's wrong, the server performs local slam as well?

Why is there a distinction between the grpc node and the server? Is there a situation in which you might want to have multiple grpc nodes on a single server?

In this scenario, you would configure many servers to do only local slam, and only one uplink server to do global slam, right?

cschuet commented 6 years ago

The cartographer_grpc_node is really just a ROS to gRPC bridge. It does some TF transformations to bring everything into a joint coordinate frame but that's it. cartographer_grpc_server is basically the SLAM algorithm behind a gRPC interface, i.e. a gRPCfication of MapBuilderInterface. In the cloud we usually don't run the cartographer_grpc_node at all. We attach our own visualizer directly to the cartographer_grpc_server. So the reason why there is a distinction between grpc node and the server is that we want to decouple cartographer from ROS. If you have a non-ROS based robot you would write a cartographer_grpc_other_middleware_node.
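The separation described above (a thin middleware bridge in front of a SLAM backend that knows nothing about the middleware) can be sketched as follows. All names here are hypothetical: `MapBuilder` stands in for the interface the server exposes, `InMemoryMapBuilder` replaces what would be a gRPC stub in the real system, and the frame conversion is reduced to a 2D translation for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol, Tuple

Point = Tuple[float, float]


class MapBuilder(Protocol):
    """Hypothetical stand-in for the SLAM interface exposed by the server."""

    def add_trajectory(self) -> int: ...

    def add_range_data(self, trajectory_id: int, points: List[Point]) -> None: ...


@dataclass
class InMemoryMapBuilder:
    """Toy backend; in the real system this would be a gRPC stub."""

    data: Dict[int, List[List[Point]]] = field(default_factory=dict)

    def add_trajectory(self) -> int:
        trajectory_id = len(self.data)
        self.data[trajectory_id] = []
        return trajectory_id

    def add_range_data(self, trajectory_id: int, points: List[Point]) -> None:
        self.data[trajectory_id].append(points)


class MiddlewareBridge:
    """Plays the role of the node: converts middleware messages and forwards them."""

    def __init__(self, builder: MapBuilder, sensor_offset: Point) -> None:
        self._builder = builder
        self._offset = sensor_offset  # sensor position in the common frame (translation only)
        self._trajectory_id = builder.add_trajectory()

    def on_scan(self, points_in_sensor_frame: List[Point]) -> None:
        """Bring the scan into the common frame, then hand it to the backend."""
        dx, dy = self._offset
        points = [(x + dx, y + dy) for x, y in points_in_sensor_frame]
        self._builder.add_range_data(self._trajectory_id, points)
```

Because the bridge depends only on the `MapBuilder` protocol, a bridge for a different middleware would reuse the same backend unchanged, which is the decoupling argument made in the comment above.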

ojura commented 6 years ago

Thanks for the explanation!

ecidon commented 6 years ago

Thanks for the explanation and the example!

ivonaj commented 6 years ago

@cschuet We are trying to send submaps to the cloud more often than only after they are completed. Is there a parameter we could tune to send them to the cloud as often as possible, or do you have tips on how to implement this?

@AnaBatinovic

cschuet commented 6 years ago

@ivonaj I currently don't have time to work on this, but maybe I can provide some pointers. For unfinished submaps the num_range_data property is used as a version number. When e.g. rviz requests a list of submaps from Cartographer and sees either a new submap or a known submap whose num_range_data property has changed, it requests the download of the corresponding probability grid from the SubmapQueryService.

Take a look at MapBuilderServer::OnLocalSlamResult, which is invoked whenever a local SLAM result is available (even if no insertion into a submap happened). Therein CreateSensorDataForLocalSlamResult is called, which is responsible for serializing the insertion submaps and adding them to the LocalSlamResultData. Here you can find the logic that sends only finished submaps. I dimly remember that I set the num_range_data property to 0 for uploading unfinished submaps (w/o a probability grid), so you might have to change that too. On the receiving side currently submap->finished = 0 iff submap contains a probability grid. This will change with your change and you should make sure that no logic falsely assumes the equivalence of the two.

Hope that helps.
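The version-number check described above can be sketched on the client side as follows. This is a simplified illustration, not code from Cartographer or rviz: `SubmapEntry`, `SubmapId`, and `submaps_to_fetch` are hypothetical names, and the only idea taken from the comment is that a submap is re-downloaded when it is new or when its num_range_data value has changed.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

SubmapId = Tuple[int, int]  # (trajectory_id, submap_index), hypothetical layout


class SubmapEntry(NamedTuple):
    submap_id: SubmapId
    num_range_data: int  # doubles as the version number of the submap


def submaps_to_fetch(known_versions: Dict[SubmapId, int],
                     listing: Iterable[SubmapEntry]) -> List[SubmapId]:
    """Return ids that are new or whose version changed, and update the cache."""
    stale: List[SubmapId] = []
    for entry in listing:
        if known_versions.get(entry.submap_id) != entry.num_range_data:
            stale.append(entry.submap_id)
            known_versions[entry.submap_id] = entry.num_range_data
    return stale
```

Note that this scheme only works if the version actually changes between uploads. If unfinished submaps are always uploaded with num_range_data set to 0, as recalled above, a client like this one would never see them as updated.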

oprezz commented 4 years ago

Hello!

I would like to try this feature, but on one machine at first. I followed the instructions, and after starting any of the nodes:

$ ROS_MASTER_URI="http://localhost:11111" roslaunch cartographer_ros cloud.launch
$ roslaunch cartographer_ros robot.launch bag_filename:=${HOME}/Downloads/cartographer_paper_deutsches_museum.bag

I get the same error: (image not preserved)

What am I possibly doing wrong? Any advice would be much appreciated!

Marton