Kinovarobotics / kinova-movo

Source code of the Kinova MOVO platform
BSD 3-Clause "New" or "Revised" License

Issues with upgrading to Kinetic - no ros master being launched #55

Open ericrosenbrown opened 5 years ago

ericrosenbrown commented 5 years ago

Hello,

I am trying to upgrade our movo to kinetic. We first upgraded our system to 16.04, and then upgraded to kinetic. At first our movo_ws had issues building, but we were eventually able to resolve all the issues and get a fully-built movo_ws. However, even though we now have a built movo_ws, when we restart the movo, there is no ros master being launched, and it does not move at all on start up.

We also apt-get upgraded everything, but that still has not resolved the issue. Is there something we're missing in the upgrade process? How do we get the movo to launch the master ros node at launch so that we can actually use the movo?

Thanks!

alexvannobel commented 5 years ago

Hi Eric! The MOVO upgrade to Kinetic is being worked on, and will only become official in a few months. However, having worked myself on the migration, I may have a couple suggestions for you :

  1. First of all, did you migrate both MOVO PCs to Ubuntu 16.04? The MOVO2 computer is easily accessible through the HMI on the back of MOVO, but the MOVO1 computer is not physically accessible if you want to keep MOVO's skin on. You will first need to set up MOVO2 as a NAT so that MOVO1 can reach the Internet (see the script at https://github.com/Kinovarobotics/kinova-movo/blob/kinetic-devel/movo_common/si_utils/scripts/setup_internet_on_movo1), and then upgrade MOVO1 to Ubuntu 16.04 via SSH. We strongly recommend that you back up MOVO1 first, though, which sadly requires you to remove the skin to get physical access to MOVO1.

  2. If you did migrate both PCs and the movo_ws builds fine on both machines, I suspect that the upstart scripts are not up to date. I suggest that you check out our kinetic-devel branch, which has not been officially tested by our Verification and Validation team but is functional under Kinetic. The system-specific packages have been adapted to 16.04. You can also look at https://github.com/Kinovarobotics/kinova-movo/blob/kinetic-devel/movo_common/si_utils/scripts/setup_movo_pc_migration to double-check that you are not missing any packages, and especially at lines 110 to 112 to modify your bashrc aliases to use systemd instead of upstart.

Let us know how this works out! Cheers, Alex

ericrosenbrown commented 5 years ago

Hi Alex,

Thank you for the reply! We have not yet upgraded the MOVO1 and we are now in the process of trying to get this to function. We have successfully backed up MOVO1; however we are unable to gain access to the internet through the provided script. Is there anything that we may need to change in the script in order for MOVO1 to gain access to the internet? Thank you!

alexvannobel commented 5 years ago

Hi Eric, The script worked as-is for me, but I can point you to a couple details:

  1. This script has to be run on MOVO2, which needs to have access to Internet (here we use an Ethernet to USB adapter) and also needs to be able to SSH to MOVO1.
  2. Line 61 :
    sudo iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

    needs the interface on which the Internet is on MOVO2. I put eth1 because my Ethernet to USB adapter was on this interface, but you have to put your own interface there (check out ifconfig!) Related to #36.
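
For readers unsure which interface carries the Internet connection on MOVO2, here is a minimal sketch (not part of the Kinova script) that reads the Linux routing table and reports the interface of the default route. It assumes a Linux host where /proc/net/route exists; the function names are hypothetical.

```python
def default_route_interface(route_table_text):
    """Return the interface of the first default route, or None.

    Expects text in the /proc/net/route layout: a header line, then
    whitespace-separated columns Iface, Destination, Gateway, Flags, ...
    """
    lines = route_table_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        iface, destination = fields[0], fields[1]
        flags = int(fields[3], 16)
        # Destination 00000000 is the default route; 0x2 is RTF_GATEWAY.
        if destination == "00000000" and flags & 0x2:
            return iface
    return None


def read_default_interface(path="/proc/net/route"):
    """Convenience wrapper that reads the live routing table (Linux only)."""
    with open(path) as handle:
        return default_route_interface(handle.read())
```

Calling read_default_interface() on MOVO2 should return the name to use in place of eth1 in the iptables line, provided the Internet-facing adapter holds the default route.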

Hope this helps, Alex

ericrosenbrown commented 5 years ago

Hi Alex,

We were able to update both MOVO1 and MOVO2 to Ubuntu 16.04 and did all of the necessary steps as outlined by the setup_movo_pc_migration script (I also reran both of these scripts after clearing all past installations to be sure). Both of the movo_ws's build perfectly and the ~/.bashrc has the correct aliases so that it uses systemd rather than upstart. Even with all of this set up, there is no ros master being launched and the movo still does not move. What may be the problem? Thank you!

Best, Jonathan

belgiumkansas commented 5 years ago

We migrated to 16.04/kinetic a few months ago. It sounds like you are having a systemd issue. Have you tried manually launching the movo_system.launch in the bringup package? Also make sure you have your SSH keys set up, as the roslaunch machine tags require movo2 to SSH into movo1.

alexvannobel commented 5 years ago

@belgiumkansas's tips are great starting points. I can add a couple tips too :

Cheers, Alex

ericrosenbrown commented 5 years ago

Hi @belgiumkansas,

Thank you for your suggestions. I just wanted to ask if you could clarify what you meant by setting up the SSH keys? Currently, we are able to ssh into movo1 from movo2. Would this be sufficient?

@alexvannobel As you suggested, we ran the rosrun commands. This was definitely something that we needed to do. After running these commands, we are able to run movo1.launch and movo2.launch individually. When done this way, we are able to get some of the topics functioning such as the kinect2 topics; however, there still seems to be an error on most of the move_group topics. Furthermore, we are unable to launch the movo_system.launch script which does not launch any of the topics that worked when we launched the individual files.

The $ROS_MASTER_URI is http://127.0.0.1:11311/ for both computers.
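
A loopback ROS_MASTER_URI on both machines means each PC looks for a master on itself, so nodes on one computer cannot see nodes on the other. As a rough illustration, the sketch below flags a master URI that points at the local machine; the function name is hypothetical and it only inspects the URI string, without contacting any master.

```python
import ipaddress
from urllib.parse import urlparse


def master_uri_is_loopback(uri):
    """Return True if the host part of a ROS master URI is a loopback address."""
    host = urlparse(uri).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A plain hostname such as "movo2": cannot be judged without a lookup.
        return False
```

Running master_uri_is_loopback(os.environ["ROS_MASTER_URI"]) on each PC gives a quick hint; on a two-computer robot, a True result on the machine that is not supposed to host the master is worth investigating.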

Thanks again.

Best, Jonathan

ericrosenbrown commented 5 years ago

Hi @alexvannobel

I just wanted to give you an update on my progress. I determined that there was most likely a networking issue so I went into both the movo_network_config.bash files for movo1 and movo2 and changed the ROS_IP to their respective 10.66.171.# ips (#=1 for movo2 and 2 for movo1). After doing that I ran the launch file for each computer respectively, and I am still running into some issues. The errors that I am getting with movo2.launch are the following

Errors similar to:

    [ERROR] [1542127719.869167559]: Action client not connected: movo/right_arm_controller/follow_joint_trajectory

For movo1.launch I am getting

    [ INFO] [1542148443.429730825]: Load description from: /robot_description
    Traceback (most recent call last):
      File "/home/movo/movo_ws/src/kinova-movo/movo_common/movo_ros/bin/movo_wd", line 48, in <module>
        movo_wd = MovoWatchdog(pc_name)
      File "/home/movo/movo_ws/src/kinova-movo/movo_common/movo_ros/src/movo/movo_system_wd.py", line 64, in __init__
        self.conn.bind(('',6234))
      File "/usr/lib/python2.7/socket.py", line 228, in meth
        return getattr(self._sock,name)(*args)
    socket.error: [Errno 98] Address already in use
    [movo1_wd-1] process has died [pid 4104, exit code 1, cmd /home/movo/movo_ws/src/kinova-movo/movo_common/movo_ros/bin/movo_wd __name:=movo1_wd __log:=/home/movo/.ros/log/2ca88d50-e794-11e8-8906-94c69112651a/movo1_wd-1.log].
    log file: /home/movo/.ros/log/2ca88d50-e794-11e8-8906-94c69112651a/movo1_wd-1.log
    [ERROR] [1542148444.378345]: Could not open socket for MOVO pan_tilt...exiting
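
Errno 98 generally means another process already holds the port, which can happen when an earlier instance of the same node is still running. A minimal sketch for checking this before launching is shown below. It assumes the watchdog uses a UDP socket (the traceback does not say which type); swap in socket.SOCK_STREAM if it turns out to be TCP. The function name is hypothetical.

```python
import errno
import socket


def udp_port_in_use(port, host=""):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:  # socket.error is an alias of OSError in Python 3
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # some other failure, e.g. permission denied
    finally:
        sock.close()
    return False
```

For example, udp_port_in_use(6234) returning True on MOVO1 before movo1.launch is started would suggest that something else is already bound to the watchdog port.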

Any advice would be helpful. Thank you!

Best, Jonathan

alexvannobel commented 5 years ago

Hi Jonathan,

It does indeed seem that your network config is not set up correctly. I have had the same problem when checking out other branches or resetting the movo_network_config.bash, and it seems to me that the version on the master branch (https://github.com/Kinovarobotics/kinova-movo/blob/master/movo_network/movo_network_config.bash) always gives the right network config for both MOVO1 and MOVO2. You can copy-paste its contents into your own movo_network_config.bash, uninstall_movo_core, reboot both MOVOs and try again. I think it should fix your MOVO ROS network config. I am pretty confident the errors you are getting are due to the network not being properly set up.

Cheers, Alex

jdchang1 commented 5 years ago

Hi @alexvannobel,

Thanks again for your help. There definitely seemed to be an issue with the networking. We have now confirmed that MOVO2 is set as the master on both computers and that ROS_IP is set correctly on each computer. However, I am still getting an error on MOVO1 when opening the sockets for various tasks. The first error that occurs is the following:

    [ INFO] [1542217495.648300493]: Load description from: /robot_description
    [ERROR] [1542217496.093639]: Could not open socket for MOVO pan_tilt...exiting

I looked into the python script that is outputting this error and this error is caused when the following occurs in the constructor of the PanTiltIO

    self._cmd_buffer = multiprocessing.Queue()
    self.txqueue = multiprocessing.Queue()
    self.rxqueue = multiprocessing.Queue()
    self.comm = IoEthThread((movo_ip,6237), self.txqueue, self.rxqueue, max_packet_size=KINOVA_ACTUATOR_RSP_SIZE_BYTES)

    if (False == self.comm.link_up):
        rospy.logerr("Could not open socket for MOVO pan_tilt...exiting")
        self.Shutdown()
        return

I am not sure how I may fix this. Do you have any input?

Best, Jonathan

alexvannobel commented 5 years ago

Hi Jonathan,

There doesn't seem to be any problem with this code; it is most likely a setup issue. When you SSH to movo1 from movo2, does SSH prompt you with movo1's password? MOVO1 and MOVO2 need to have each other as known hosts so their passwords are never prompted (from 1--->2 and also from 2--->1). As ROS automatically opens sockets and SSH connections, a password prompt can mess with a lot of the nodes. To make sure both MOVO computers know each other as known hosts, you can run the following in a bash console.

On MOVO1:

    ssh-copy-id movo@MOVO2

On MOVO2:

    ssh-copy-id movo@MOVO1

Those commands will prompt you with entering the password, and will register the computer as a known SSH host. You can test it worked by ssh'ing to the other computer.
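
To check this without watching for a prompt by hand, one option is to run ssh in batch mode, where it fails instead of asking for a password. The sketch below is only an illustration: the helper names are hypothetical, and the movo user and MOVO1/MOVO2 host names are taken from the commands above. For background, ssh-copy-id works by appending the local public key to the remote account's authorized_keys file.

```python
import subprocess


def build_ssh_check_cmd(host, user="movo", timeout_s=5):
    """Build an ssh command that runs `true` remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # fail rather than ask for a password
        "-o", "ConnectTimeout={}".format(timeout_s),
        "{}@{}".format(user, host),
        "true",
    ]


def can_ssh_without_prompt(host, user="movo"):
    """Return True if a non-interactive SSH login to host succeeds."""
    result = subprocess.run(
        build_ssh_check_cmd(host, user),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Running can_ssh_without_prompt("MOVO1") on MOVO2, and can_ssh_without_prompt("MOVO2") on MOVO1, should both return True if key-based login works in both directions.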

Let me know if it fixes your problem!

Cheers, Alex

jdchang1 commented 5 years ago

Hi Alex,

So I tried to ssh and was not prompted for a password. Just in case, I ran the above commands and was informed that the relevant hosts already exist. I am still able to ssh from one computer to the other without any problems.

The issue with movo1.launch still seems to be a socket issue with movo pan_tilt. Just to provide some extra information, the movo's grippers open and close on startup (as usual), but the robot does not continue with the rest of the startup. I was able to fix the socket errors by changing the sockets in multiple scripts located in movo_common/movo_ros/src/movo/, but there were still subsequent errors, so I reverted the scripts to what they were.

Do you have any other suggestions?

Thank you!

jdchang1 commented 5 years ago

Hi again Alex,

We were just able to fix our movo. The problem was that, on the kinetic-devel branch, the environment variable saying that the movo has a 7 DOF arm was set to false and the 6 DOF variable was set to true by default. Our Movo has a 7 DOF arm, so moveit was failing on startup, thinking the movo was in self-collision.
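
A quick way to catch this kind of mismatch is to read the two arm flags from the environment and confirm that exactly one is true. The sketch below is a rough illustration only: the variable names MOVO_HAS_KINOVA_ARM_7DOF and MOVO_HAS_KINOVA_ARM_6DOF are assumptions, since the thread does not spell them out, so check them against your own movo_config.bash.

```python
# Assumed variable names; verify against movo_config.bash before relying on them.
VAR_7DOF = "MOVO_HAS_KINOVA_ARM_7DOF"
VAR_6DOF = "MOVO_HAS_KINOVA_ARM_6DOF"


def _is_true(value):
    """Interpret a shell-style boolean string such as 'true' or 'false'."""
    return str(value).strip().lower() == "true"


def configured_arm_dof(env):
    """Return 7 or 6 when exactly one arm flag is true, otherwise None."""
    seven = _is_true(env.get(VAR_7DOF, "false"))
    six = _is_true(env.get(VAR_6DOF, "false"))
    if seven == six:
        # Both set or neither set: the configuration is ambiguous.
        return None
    return 7 if seven else 6
```

Passing os.environ to configured_arm_dof in a shell that has sourced the MOVO config shows which arm the software expects, which can then be compared with the hardware.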

After this success, we tried looking for rostopics and checking the data and noticed that the ROS_MASTER_URI is now set to MOVO1 instead of MOVO2. Is this something you guys did intentionally for the kinetic_devel branch? If not, how could we go about changing the ROS_MASTER_URI to be MOVO2 by default? We also noticed that a lot of the old kinect2 topics that we used to use on the Indigo branch are now gone. Is there any way we can get these topics back or is this by design?

Thank you so much for all your help and speedy replies! It has really made this process much better.

alexvannobel commented 5 years ago

Hi Jonathan!

I was about to provide you with some more tips but I'm glad to hear you got it working! I will look closer into the movo_config for the kinetic-devel branch to make sure the environment variables are set correctly because MOVO2 should really still be the master in ROS Kinetic.

Thanks to you for your detailed explanations!

Cheers, Alex

pragathip commented 5 years ago

We're having similar issues after migration from Indigo to Kinetic. The grippers turn on and off for a bit, but the arms don't move. Appropriate changes to the variables have been made to movo_network_config.bash and movo_config.bash after the update. We have the same error that @jdchang1 mentions: Could not open socket for MOVO pan_tilt...exiting even after changing the environment variables. Any suggestions on how we can proceed with the debugging?