osrf / srcsim

Space Robotics Challenge

Traffic shaping seems to break ros connection between FC and SIM #230

Closed osrf-migration closed 7 years ago

osrf-migration commented 7 years ago

Original report (archived issue) by Jeremy White (Bitbucket: knitfoo).


While experimenting with traffic shaping on a current container, I found that if I apply shaping and then start the FC, things work as I expect. If, however, I change the shaping parameters (even to make them 'nicer') while the FC is running, the FC loses its ROS connection to the sim and cannot re-establish it until it is restarted.

I gather that changing the shaping after each task completes is the planned production behavior, so I wanted to make sure to point this out now.

The relevant stack trace is:

#!python

  File "./lib/zarj/walk.py", line 88, in __init__
    if rospy.has_param(rfp) and rospy.has_param(lfp):
  File "/opt/ros/indigo/lib/python2.7/dist-packages/rospy/client.py", line 546, in has_param
    return param_name in _param_server #MasterProxy does all the magic for us
  File "/opt/ros/indigo/lib/python2.7/dist-packages/rospy/msproxy.py", line 195, in __contains__
    code, msg, value = self.target.hasParam(rospy.names.get_caller_id(), rospy.names.resolve_name(key))
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1273, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1303, in single_request
    response = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1089, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
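
The trace ends inside a blocking `recv` on the XML-RPC call to the ROS master. One way to limit the damage from a transient socket error on such a call is a small retry wrapper. This is only a sketch: `call_with_retry` and its parameters are hypothetical names, not part of rospy, and it would not repair a connection that stays broken until the FC is restarted.

```python
import socket
import time


def call_with_retry(fn, attempts=3, delay_s=0.5, sleep=time.sleep):
    """Call fn(); on a socket-level error, wait and retry up to `attempts` times.

    Hypothetical helper, not a rospy API. `sleep` is injectable so the
    wrapper can be exercised without real delays.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except socket.error as err:  # includes resets, closed connections, timeouts
            last_error = err
            if attempt < attempts - 1:
                sleep(delay_s)
    # Every attempt failed: re-raise the last error so the caller sees it.
    raise last_error
```

In the FC code this might wrap the parameter check, for example `call_with_retry(lambda: rospy.has_param(rfp))`. Other exception types raised by the XML-RPC layer may need to be caught as well.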

osrf-migration commented 7 years ago

Original comment by Ian Chen (Bitbucket: Ian Chen, GitHub: iche033).


Last night, after responding to another thread, I found that IP filtering wasn't working and made some tweaks. It was also applying network restrictions to the link between the Sim and FC. I wonder if that's the problem. Can you confirm that the version you're using is the same as the one currently in the pull request (commit d98623f)?

osrf-migration commented 7 years ago

Original comment by Ian Chen (Bitbucket: Ian Chen, GitHub: iche033).


I'll also test changing the TC params on the fly and see if I can reproduce the problem.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


Yes, I'm using that script. I'm retesting this right now, to try to post the complete error (I see I cut off the error, which I think was connection closed).

As I look at the script, I suspect it's the fact that we always close ifb0; that probably causes us heartburn. If we have a script that doesn't do that, but just adjusts parameters, we'll probably be okay.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


Wait, this gets stranger. It appears as though that script is shaping the whole connection, not just the .150 endpoint.

For example:

#!bash

root@ip-172-31-19-11:~# ./src_tc.rb -i tap0 -d 100mbit -u 100mbit -f 192.168.2.150/26 -l 20ms
RTNETLINK answers: File exists
root@ip-172-31-19-11:~# docker exec -it team_container bash
root@46d72a7c59e0:/home/docker/ws# ping 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=40.9 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=40.9 ms
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=40.9 ms
64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=41.0 ms

The 40 ms ping time clearly shows that the shaping is being applied to the 192.168.2.1 target as well. The bandwidth limits also apply, as I learned the hard way.
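
As a sanity check on the address ranges, here is a minimal sketch in Python 3 using the standard `ipaddress` module. It assumes the `-f` value is read as an ordinary CIDR range and is meant to limit shaping to that range; how `src_tc.rb` actually interprets it is not shown in this thread.

```python
import ipaddress

# Filter value passed to src_tc.rb via -f in the example above.
# strict=False lets a host address (.150) stand in for its enclosing network.
shaped = ipaddress.ip_network("192.168.2.150/26", strict=False)

print(shaped)  # the enclosing /26 network

for host in ("192.168.2.150", "192.168.2.1", "192.168.2.10"):
    inside = ipaddress.ip_address(host) in shaped
    print(host, "inside filter range" if inside else "outside filter range")
```

Under that assumption, 192.168.2.1 falls outside the range, so a shaped round trip to it suggests the filter is not restricting the shaping to the intended addresses.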

osrf-migration commented 7 years ago

Original comment by Ian Chen (Bitbucket: Ian Chen, GitHub: iche033).


I have been testing this, but somehow I can't reproduce the issue. Here's what I'm doing:

Inside the FC docker container I keep rostopic echo running:

rostopic echo /ihmc_ros/valkyrie/output/robot_pose

On the FC host, I change the TC settings:

sudo ./src_tc.rb -i tap0 -d 500kbit -u 2mbit -f 192.168.2.150/26 -l 250ms

# wait a little and then

sudo ./src_tc.rb -i tap0 -d 500kbit -u 2mbit -f 192.168.2.150/26 -l 500ms

On the OCU, I keep ping running:

ping 192.168.2.10

Results:

rostopic echo keeps working inside the FC container, and rostopic hz reports 500 Hz.

The OCU -> FC ping time changed to 507 ms and then 1007 ms.

Maybe I missed something?
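
The ping times reported in this thread are consistent with the `-l` delay being added once in each direction. The sketch below works through that arithmetic; the two-way assumption is inferred from the numbers here, not confirmed from the script, and `expected_rtt_ms` is a hypothetical helper.

```python
def expected_rtt_ms(one_way_delay_ms, baseline_rtt_ms=0.0):
    """Round-trip time if the same delay is added in each direction."""
    return 2 * one_way_delay_ms + baseline_rtt_ms


# (configured -l value in ms, observed ping RTT in ms), taken from this thread
observations = [(20, 40.9), (250, 507.0), (500, 1007.0)]

for delay_ms, observed_ms in observations:
    shaped = expected_rtt_ms(delay_ms)
    # Whatever is left over is the unshaped baseline RTT of the link.
    print("-l %dms: shaping adds %.1f ms, observed %.1f ms, baseline ~%.1f ms"
          % (delay_ms, shaped, observed_ms, observed_ms - shaped))
```

The leftover baseline is under 1 ms for the container-to-192.168.2.1 ping and about 7 ms for the OCU-to-FC ping, which fits two different network paths.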

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


I think this may be my error. I just tried to reproduce this with d98623f, and it did not reproduce. I then followed the same procedure I used before and saw that I got f960306. If I download that commit, the error occurs.

Gah! And now I can't reproduce the pathway that confused me. That is, I swear to you that I went to the cloudsim-sim tree, and grabbed what appeared to be the very latest commit. But now when I revisit that page, it no longer appears to be the latest.

At any rate, this appears to be my error; sorry for the churn.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


Looks to be my error.