ctu-mrs / mrs_aloam_core

Metapackage for running the system with ALOAM
BSD 3-Clause "New" or "Revised" License

Simulation using ALOAM consuming all network bandwidth #1

Open laurenbramblett opened 2 months ago

laurenbramblett commented 2 months ago

Hello,

I am using ALOAM on both the simulator and an F4F drone for localization. When running on the drone and connected to WIFI, I have no issues, but when running on the desktop and connected via WIFI, it consumes all the bandwidth on the network. Have you all ever experienced this? It is isolated to when the mrs_aloam_core simulation is running. Before running slam_pipeline.launch there are no network issues and after the download speed for other computers on the network is reduced to 1Kb/s.

Thank you!

klaxalk commented 2 months ago

Hello, is it possible that the traffic goes through one of your physical interfaces? What is your ROS_MASTER_URI setup? You should have the following in your .bashrc for local simulation:

export ROS_MASTER_URI=http://localhost:11311
export ROS_HOSTNAME=localhost
export ROS_IP=127.0.0.1

pritzvac commented 2 months ago

Also, check that you have 127.0.0.1 localhost in your /etc/hosts file. I think something similar happened to me, and it was caused by this line missing.

laurenbramblett commented 2 months ago

Hi! Unfortunately those did not fix the issue. I already had the .bashrc set up as you said and the localhost entry in the /etc/hosts file:

export ROS_MASTER_URI=http://localhost:11311
export ROS_HOSTNAME=localhost
export ROS_IP=127.0.0.1

Would there be a reason why it is only happening when aloam is running?

pritzvac commented 2 months ago

The lidar pointclouds processed by aloam consist of quite a lot of data, so if they are running over the wi-fi instead of just in your pc, I guess it could cause network issues. I still think it could be caused by the settings we mentioned. Maybe you have something else in your .bashrc or /etc/hosts that overrides these settings?

Try checking the values of these environment variables when running the simulation (e.g. echo $ROS_MASTER_URI). Also, if this happens again, try running ping localhost and traceroute localhost in the simulation session to see whether it pings the right IP and whether it goes over the wi-fi or not.
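The environment checks suggested above could be collected into a small shell helper. This is only a sketch: the function names `is_loopback`, `uri_host`, and `report_ros_env` are made up for illustration and are not part of ROS or the MRS system, and the check is plain string matching on the variable values (it does not resolve hostnames, and it does not handle IPv6 URIs).

```shell
# Hypothetical helpers, not part of ROS or the MRS system.

# Succeeds if the given host string is a loopback name or address.
is_loopback() {
  case "$1" in
    localhost|127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Strips the scheme, path, and port from a URI such as http://localhost:11311.
# Runs in a subshell so the temporary variable does not leak.
uri_host() (
  host="${1#*://}"    # drop "http://"
  host="${host%%/*}"  # drop any path
  host="${host%:*}"   # drop ":port"
  printf '%s\n' "$host"
)

# Prints each ROS networking variable and whether it points at loopback.
report_ros_env() (
  for name in ROS_MASTER_URI ROS_HOSTNAME ROS_IP; do
    eval "value=\${$name:-}"
    if is_loopback "$(uri_host "$value")"; then
      echo "$name=$value (loopback)"
    else
      echo "$name=$value (unset or NOT loopback)"
    fi
  done
)
```

Running `report_ros_env` in the same tmux pane as the simulation would show whether something later in .bashrc overrides the expected values.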

laurenbramblett commented 2 months ago

Hello,

I ran traceroute localhost and also echoed ROS_MASTER_URI, and both point to 127.0.0.1 correctly. I also only see one hop: [image]

Also, I see another error only when running mrs_aloam_core in the gazebo pane: [image]

pritzvac commented 2 months ago

That's weird. When you say that you're running the simulation on the desktop and connected via wi-fi, are you running the simulation and visualizing data on the same pc, which is just connected to a wi-fi? Or are you running a simulation on a remote desktop and visualizing data on another pc?

If you're visualizing the data on another pc and using the default rviz config, try to uncheck the unreliable option of the ouster pointcloud topic in RViz.

laurenbramblett commented 1 month ago

Hi! Sorry for the delay. I am running the simulation on my desktop and visualizing on the same pc. I also have disabled the rviz launch and set the gazebo gui visualization to false with the same results. The problem seems to only persist when I launch a drone with a 3d lidar (ouster or velodyne) in the forest world; however, most of the other core simulations are much more lightweight and still cause a slight uptick in the network traffic. I am still not sure why it is sending data over the network, even when I set my GAZEBO_IP to localhost as well.

laurenbramblett commented 1 month ago

One more note: when I run sudo iftop to check the network traffic during the simulation, I see that a bunch of data is being sent to st-routers.mcast.net

pritzvac commented 1 month ago

In my case, I have all of the following lines in /etc/hosts. Maybe check if you have all of this (replace <hostname> with the hostname of your pc):

127.0.0.1 localhost
127.0.1.1 <hostname>

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
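A quick way to verify the IPv4 entries above without reading the file by eye is sketched below. The helper name `hosts_has_entry` is hypothetical, and it only checks that a given IP is mapped to a given name on a non-commented line.

```shell
# Hypothetical helper: succeeds if FILE maps IP to NAME.
# Usage: hosts_has_entry FILE IP NAME
hosts_has_entry() (
  file="$1"; ip="$2"; name="$3"
  awk -v ip="$ip" -v name="$name" '
    $1 == ip {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break        # stop at a trailing comment
        if ($i == name) found = 1
      }
    }
    END { exit !found }
  ' "$file"
)

# Example (commented out so that sourcing this file has no side effects):
# hosts_has_entry /etc/hosts 127.0.0.1 localhost || echo "localhost entry missing"
# hosts_has_entry /etc/hosts 127.0.1.1 "$(hostname)" || echo "hostname entry missing"
```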

Also you can try to run roswtf when the simulation is running, to see if it detects any helpful errors.

But I wasn't able to reproduce this behavior even when I tried to. In my case, I also see some traffic to st-routers.mcast.net and see some data going through the wi-fi interface when the simulation is running, but I see no increase when a lidar is simulated / when ALOAM is running.

What's the amount of data going through your network interfaces when you use nload to measure it? In the ALOAM simulation, I have around 110 MBit/s going through the loopback (lo) interface and around 500 kBit/s incoming and 130 kBit/s outgoing over the wi-fi. In the mrs_uav_gazebo_simulation/tmux/one_drone simulation, I have around 20 MBit/s going through loopback, and the traffic through wi-fi is the same as in the ALOAM simulation.
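If nload is not installed, a rough per-interface number can also be taken from the Linux sysfs byte counters. The sketch below assumes a Linux system; the function names are made up for illustration, and the rate is computed with 1 kbit = 1000 bits and integer arithmetic, so small rates are rounded down.

```shell
# Prints the cumulative transmitted byte count of an interface (Linux sysfs).
tx_bytes() {
  cat "/sys/class/net/$1/statistics/tx_bytes"
}

# Converts a byte delta over a number of seconds into kbit/s (integer).
# Usage: rate_kbit BYTES_BEFORE BYTES_AFTER SECONDS
rate_kbit() {
  echo $(( ($2 - $1) * 8 / 1000 / $3 ))
}

# Samples an interface for SECONDS (default 5) and prints its average outgoing rate.
# Usage: sample_tx_kbit IFACE [SECONDS]
sample_tx_kbit() (
  iface="$1"; secs="${2:-5}"
  before=$(tx_bytes "$iface")
  sleep "$secs"
  after=$(tx_bytes "$iface")
  echo "$iface: $(rate_kbit "$before" "$after" "$secs") kbit/s outgoing"
)
```

For example, `sample_tx_kbit lo 5` followed by `sample_tx_kbit wlan0 5` (where `wlan0` is a placeholder for the actual wi-fi interface name) would give the loopback and wi-fi figures to compare.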

Also, is the simulation working fine, when you're not connected to a wi-fi?

laurenbramblett commented 1 month ago

Hello,

Yes, I have checked those and made them exactly the same as yours with no differences. I think I have isolated the problem. It appears to be a problem with gazebo. If I am using the forest world, my network is much slower than with the grass_plane world (the difference in ping time is 6 ms versus 2000 ms). Is there something in the libMrsGazeboCommonResources_StaticTransformRepublisher.so that would eat up bandwidth? I also noticed that if I set gui:=false, the ping time is halved (3 ms for grass_plane and 1000 ms for forest).

The amount of data outgoing for the forest world is approximately 1MBit/s and I am unable to ping the router from another computer. The grass_plane world has 30kBit/s. Both were tested with gui set to false.

The simulation works just fine when not connected to wi-fi so it is only an issue if I need to remote in to the desktop and others are working on the same router in the lab.

Thank you for your help!

laurenbramblett commented 1 month ago

Also of note, I connected to the router without wifi and had no ping speed slowdowns. It is only the outgoing over internet that seems to be the issue

pritzvac commented 1 month ago

As far as I know, the static transform republisher shouldn't use up network bandwidth + it's in both of the worlds that you mention. The only connection I see there is that your bandwidth usage seems to increase when you utilize GPU more (gazebo GUI uses GPU, 3D Lidar simulation uses GPU heavily). Is it possible that you have something on your pc that would cause it to use up network bandwidth when GPU is utilized?

You can try to run https://github.com/ctu-mrs/mrs_uav_gazebo_simulation/tree/master/ros_packages/mrs_uav_gazebo_simulation/tmux/one_drone simulation and change grass_plane to forest + run https://github.com/ctu-mrs/mrs_uav_gazebo_simulation/tree/master/ros_packages/mrs_uav_gazebo_simulation/tmux/one_drone_3dlidar and change forest to grass_plane and check the ping to see if it's caused by the specific world or by the lidar simulation.
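To make the ping comparison between the two runs easier to record, the average round-trip time can be pulled out of the ping summary line. This is a sketch that assumes the usual `min/avg/max/...` summary format printed by ping; the function name `ping_avg_ms` is made up for illustration.

```shell
# Reads ping output on stdin and prints the average round-trip time in ms.
# Assumes a summary line like: rtt min/avg/max/mdev = 0.045/0.051/0.060/0.006 ms
ping_avg_ms() {
  awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Example (commented out; replace <router-ip> with the address of your router):
# ping -c 5 <router-ip> | ping_avg_ms
```

Running this once per world and once per sensor configuration gives one number per case to compare.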

Also, try installing and running sudo nethogs to see whether it shows which process is actually using up the bandwidth.