gavanderhoorn / fanuc_driver_exp

An alternative - experimental - Fanuc robot driver for ROS-Industrial
Apache License 2.0
28 stars 17 forks source link

Cannot connect socket to controller #2

Closed chrisgilmerproj closed 7 years ago

chrisgilmerproj commented 7 years ago

I am attempting to create a socket connection to read the robot state using the fanuc_driver_exp driver. I'm using an LR Mate 200iD robot on a R30iB controller.

I followed the configuration directions for the original fanuc_driver. I'm also using server tags 3 & 4, as well as setting these variables:

KAREL Vars for ROS_STATE:

Name Type Default Unit Description
checked BOOLEAN True - user flagged: config checked
loop_hz INTEGER 42 Hz main loop update rate (in Hz)
s_accpt_dlay INTEGER 100 ms 'return to sock_accept' delay (ms)
s_tag_nr INTEGER 3 - server TAG number to use
s_tcp_nr INTEGER 11002 - TCP port to listen on
um_clear BOOLEAN True - clear user menu on start

KAREL Vars for ROS_TRAJ:

Name Type Default Unit Description
checked BOOLEAN True - user flagged: config checked
f_msm_drdy INTEGER 2 - flag used for 'motion sm data ready' signal
f_msm_rdy INTEGER 1 - flag used for 'motion sm ready' signal
loop_hz INTEGER 42 Hz main loop update rate (in Hz)
move_cnt INTEGER 50 - CNT to use for each motion instruction
pr_move INTEGER 1 - pos reg to use for next traj pt
r_move_cnt INTEGER 2 - int reg to store traj segment CNT in
r_skip_sig INTEGER 6 - int reg monitored by SKIP CONDITION in TP
r_tseg_vel INTEGER 7 - int reg to store traj segment joint velocity percentage in
s_accpt_dlay INTEGER 100 ms 'return to sock_accept' delay (ms)
s_tag_nr INTEGER 4 - server TAG number to use
s_tcp_nr INTEGER 11000 - TCP port to listen on
tp_val_trstp INTEGER 11 - value to set 'r_skip_sig' reg to to communicate traj stop
trjpt_buf_sz INTEGER 10 - nr of traj pts to buffer
um_clear BOOLEAN True - clear user menu on start

I did not change any variables or flags in the ROS_MOVESM TP program assuming that all the defaults will work (this is a brand new robot).

To test the connection I want to bind to the correct socket and read any data that comes out:

#! /usr/bin/env python         

import socket                  
import time

HOST = '192.168.1.201'         
PORT = 11002                   

def main():
    while True:                
        try:
            print("Connecting to {}:{} ...".format(HOST, PORT))
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind((HOST, PORT))         
            print('reading')   
            data = sock.recv(1024)          
            print(data)
            sock.close()
            break
        except socket.error:
            pass
        time.sleep(2)

if __name__ == "__main__":
    main()

This short program never binds. I have no errors or alarms and I'm running the ROS program in Auto mode. I see the following from Menu -> User (formatting is correct, meaning newlines are missing where they were expected):

31508 31508 I   I RSTA loaded grp:RTRJ 1

31508 31508 I   I RSTA init doneRTRJ buf
10 31508
I  31508RSTA wait client I RTRJ init done
31508 I RTRJ wait client

I know that I can connect to the robot over EthernetIP using the ODVA CIP protocol. This means I can ping the robot, and read and write registers all over ethernet. All this is to say I'm not worried about this being a network issue.

When I hit the Select button and choose the ROS program it takes me to line 16 of ROS_MOVESM, which reads:

15: !wait for relay
16: WAIT (F[2])

Any thoughts as to how to debug this and get a simple TCP socket connection open to read the ROS state messages?

gavanderhoorn commented 7 years ago

Hi, thanks for trying out the new driver.

A first note: I would not set s_accpt_dlay to 100 ms. This is the delay between a client losing/closing a connection (for whatever reason) and the server re-opening the socket for new connections. A too low value for this entry can result in a 'runaway' program as the TP is overwhelmed with messages and the UI gets blocked.

As to your question / issue: from the TP output I have the impression that no connection attempt has been made. The fact that EIP works is not conclusive proof that your network setup is correct.

After you have started the ROS program, can you try the following with either nc or ncat or something similar:

nc 192.168.1.201 11002

Can you let me know what the output of that is?

Also: if you could include a wireshark capture of your connection attempt I should be able to tell you more about the network side of things.

gavanderhoorn commented 7 years ago

I see the following from Menu -> User (formatting is correct, meaning newlines are missing where they were expected)

yes, that can happen. I need to spend some time reimplementing logging. Right now it's just each program directly writing to the TPDISPLAY, which is not a good approach when you have multiple.

gavanderhoorn commented 7 years ago

I did not change any variables or flags in the ROS_MOVESM TP program assuming that all the defaults will work (this is a brand new robot).

Your config could work, but you'll need to make sure that the values you've set for r_skip_sig (here, here and here), r_tseg_vel (here) and tp_val_trstp (here, here and here) are also used by your copy of ROS_MOVESM.

Note that both ROS_STATE as well as ROS_TRAJ will set sane defaults upon first start. If you have no other programs (running concurrently) on that controller you might want to use those, as you then don't need to edit ROS_MOVESM.

Btw: I've not had an opportunity to update / write any docs for this version of the driver, as I assumed that the people that would be testing it would have experience with the previous version and I have been supporting them via email. The docs for fanuc_driver do cover most of it though, as you probably found out.

gavanderhoorn commented 7 years ago

As for the msg layout and general protocol info: Message Structures of the ROS-Industrial Simple Message Protocol is a WIP document. fanuc_driver_exp uses JOINT_POSITION, JOINT_TRAJ_PT and STATUS messages.

chrisgilmerproj commented 7 years ago

Thank you for the quick reply and for all the detail. Looks like I'm running now.

Here is the new set of values I put in for the KAREL programs (using the defaults):

KAREL Vars for ROS_STATE:

Name Type Default Unit Description
checked BOOLEAN True - user flagged: config checked
loop_hz INTEGER 125 Hz main loop update rate (in Hz)
s_accpt_dlay INTEGER 1000 ms 'return to sock_accept' delay (ms)
s_tag_nr INTEGER 3 - server TAG number to use
s_tcp_nr INTEGER 11002 - TCP port to listen on
um_clear BOOLEAN True - clear user menu on start

KAREL Vars for ROS_TRAJ:

Name Type Default Unit Description
checked BOOLEAN True - user flagged: config checked
f_msm_drdy INTEGER 2 - flag used for 'motion sm data ready' signal
f_msm_rdy INTEGER 1 - flag used for 'motion sm ready' signal
loop_hz INTEGER 125 Hz main loop update rate (in Hz)
move_cnt INTEGER 50 - CNT to use for each motion instruction
pr_move INTEGER 1 - pos reg to use for next traj pt
r_move_cnt INTEGER 2 - int reg to store traj segment CNT in
r_skip_sig INTEGER 5 - int reg monitored by SKIP CONDITION in TP
r_tseg_vel INTEGER 1 - int reg to store traj segment joint velocity percentage in
s_accpt_dlay INTEGER 1000 ms 'return to sock_accept' delay (ms)
s_tag_nr INTEGER 4 - server TAG number to use
s_tcp_nr INTEGER 11000 - TCP port to listen on
tp_val_trstp INTEGER 1 - value to set 'r_skip_sig' reg to to communicate traj stop
trjpt_buf_sz INTEGER 6 - nr of traj pts to buffer
um_clear BOOLEAN True - clear user menu on start

I verified that I could ping the client:

$ ping 192.168.1.201

And saw this output:

$ sudo tshark -i en7 -Y "ip.addr==192.168.1.201"
15409 1125.000840 192.168.1.86 → 192.168.1.201 ICMP 98 Echo (ping) request  id=0x0a12, seq=0/0, ttl=64
15411 1125.001825 192.168.1.201 → 192.168.1.86 ICMP 98 Echo (ping) reply    id=0x0a12, seq=0/0, ttl=60 (request in 15409)
15426 1126.006156 192.168.1.86 → 192.168.1.201 ICMP 98 Echo (ping) request  id=0x0a12, seq=1/256, ttl=64
15427 1126.006974 192.168.1.201 → 192.168.1.86 ICMP 98 Echo (ping) reply    id=0x0a12, seq=1/256, ttl=60 (request in 15426)

I verified that I could connect with nc:

$  nc 192.168.1.201 11002
8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
8��Q?U�8���t�V>�9:+�(
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
���Q?U�8���t�V>�9:+�8
$ sudo tshark -i en7 -Y "ip.addr==192.168.1.201"
18746 1283.345928 192.168.1.201 → 192.168.1.86 TCP 98 11002 → 52202 [PSH, ACK] Seq=11925 Ack=1 Win=8192 Len=44
18747 1283.346062 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=11969 Win=65535 Len=0
18748 1283.351828 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=11969 Ack=1 Win=8192 Len=60
18749 1283.351910 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12029 Win=65535 Len=0
18750 1283.359840 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12029 Ack=1 Win=8192 Len=60
18751 1283.359940 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12089 Win=65535 Len=0
18752 1283.367933 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12089 Ack=1 Win=8192 Len=60
18753 1283.367978 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12149 Win=65535 Len=0
18754 1283.376692 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12149 Ack=1 Win=8192 Len=60
18755 1283.376732 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12209 Win=65535 Len=0
18756 1283.383855 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12209 Ack=1 Win=8192 Len=60
18757 1283.383884 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12269 Win=65535 Len=0
18758 1283.391834 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12269 Ack=1 Win=8192 Len=60
18759 1283.391857 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12329 Win=65535 Len=0
18760 1283.407803 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12329 Ack=1 Win=8192 Len=60
18761 1283.407823 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12389 Win=65535 Len=0
18762 1283.415816 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12389 Ack=1 Win=8192 Len=60
18763 1283.415834 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12449 Win=65535 Len=0
18764 1283.423846 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12449 Ack=1 Win=8192 Len=60
18765 1283.423874 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12509 Win=65535 Len=0
18766 1283.431900 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12509 Ack=1 Win=8192 Len=60
18767 1283.431930 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12569 Win=65535 Len=0
18768 1283.439851 192.168.1.201 → 192.168.1.86 TCP 114 11002 → 52202 [PSH, ACK] Seq=12569 Ack=1 Win=8192 Len=60
18769 1283.439890 192.168.1.86 → 192.168.1.201 TCP 54 52202 → 11002 [ACK] Seq=1 Ack=12629 Win=65535 Len=0

I did see some messages come across the User panel:

20188 E RSTA pub A err: -67208
20188 I RSTA hndl_client exit
20188 I RSTA disconnected
20188 I RSTA wait client
20202 I RSTA connected
20203 E RSTA pub A err: -67208
20203 I RSTA hndl_client exit
20203 I RSTA disconnected
20203 I RSTA wait client

Then I tried this (modified) python program which worked. Had to use connect instead of bind.

#! /usr/bin/env python         

import socket                  
import time

HOST = '192.168.1.201'         
PORT = 11002                   

def main():
    while True:                
        try:
            print("Connecting to {}:{} ...".format(HOST, PORT))
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.connect((HOST, PORT))         
            print('reading')   
            data = sock.recv(1024)          
            print(data)
            sock.close()
            break
        except socket.error:
            pass
        time.sleep(2)

if __name__ == "__main__":
    main()
gavanderhoorn commented 7 years ago

Looks like everything is working as it should, yes.

re: bind vs connect: yes, that is indeed what was needed. I hadn't checked your Python program, actually.

Could you close this if you feel your issue has been resolved?

gavanderhoorn commented 7 years ago

If you're going to use Python, you could take a look at simple_message_py, which is a start of a Python implementation of the Simple Message protocol.

chrisgilmerproj commented 7 years ago

I can so far read the status from port 11002, but now I'm trying to confirm that I can write to port 11000. I've got a formatted JOINT_TRAJ_PT string but I can't seem to connect. Do you have an idea what might be going on here?

If you're going to use Python, you could take a look at simple_message_py, which is a start of a Python implementation of the Simple Message protocol.

This is great. I was just starting to implement this. Any chance you could make it work without twisted? I'm currently using sockets and threads to make a simple client.

chrisgilmerproj commented 7 years ago

Looks like I was able to connect after restarting the ROS program. Probably because I had either not properly closed previous sockets or because my delay at 1000 ms is still too short. I sent this message:

[64, 11, 2, 0, 1, 0.0, 0.8341555595397949, -0.5112152695655823, -0.059770114719867706, 0.3402792811393737, 0.0005963250296190382, 0.0, 0.0, 0.0, 0.0, 0.1, 5.0]

And got this response:

(56, 10, 3, 1, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Which I decoded into:

(56, 'JOINT_POSITION', 'SERVICE_REPLY', 'SUCCESS')
{'j0': 0.0,
 'j1': 0.0,
 'j2': 0.0,
 'j3': 0.0,
 'j4': 0.0,
 'j5': 0.0,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 1.0}

I'm pretty sure this isn't correct. I would expect it to take the trajectory point and move the robot. Is that not right? Is there something else I must do to have it move?

Edit: I found that its apparently supposed to return zeros: https://github.com/gavanderhoorn/fanuc_driver_exp/blob/master/ros_traj.kl#L542-L557

chrisgilmerproj commented 7 years ago

I got the status out directly after sending that command:

(40, 'STATUS', 'TOPIC', 'INVALID')
{'drives_powered': 'ON',
 'e_stopped': 'OFF',
 'error_code': 'OFF',
 'in_error': 'OFF',
 'in_motion': 'OFF',
 'mode': 'MANUAL',
 'motion_possible': 'OFF'}

This makes me think that motion_possible is the issue. The Tri-States I'm using looks like this:

TRI_STATES = {
    -1: "UNKNOWN",
    0: "OFF",  # Also encodes FALSE, DISABLED or LOW
    1: "ON",   # Also encodes TRUE, ENABLED or HIGH
}
chrisgilmerproj commented 7 years ago

One last question: I can't seem to run this code in AUTO mode. I've set it to start on COLD START and while ROS_STATE and ROS_TRAJ appear to be running, I cannot connect to them.

chrisgilmerproj commented 7 years ago

I'm seeing two errors pop up from the USER interface:

The first I think has to do with the BYTES_AHEAD call. My KAREL manual doesn't tell me what the error means though.

The second error I think has to do with writing the status message.

chrisgilmerproj commented 7 years ago

One last piece of info and then I'll leave you alone (I promise!).

I put the robot in auto mode, cleared all the faults (ie everything is green) and then ran my command:

Connecting to 192.168.1.201:11000 ...
(56, 'JOINT_POSITION', 'SERVICE_REPLY', 'SUCCESS')
{'j0': 0.0,
 'j1': 0.0,
 'j2': 0.0,
 'j3': 0.0,
 'j4': 0.0,
 'j5': 0.0,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 1}

I expected this output, but nothing actually happened. My software was setup to listen to the status stream immediately after the send, so I got this output:

Connecting to 192.168.1.201:11002 ...
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(56, 'JOINT_POSITION', 'TOPIC', 'INVALID')
{'j0': -25.770153474296368,
 'j1': 47.79405405677288,
 'j2': -29.291146727289206,
 'j3': -3.4245565314409587,
 'j4': 19.496426646153726,
 'j5': 0.034664724152015484,
 'j6': 0.0,
 'j7': 0.0,
 'j8': 0.0,
 'j9': 0.0,
 'sequence': 0}
(40, 'STATUS', 'TOPIC', 'INVALID')
{'drives_powered': 'ON',
 'e_stopped': 'OFF',
 'error_code': 'OFF',
 'in_error': 'OFF',
 'in_motion': 'OFF',
 'mode': 'AUTO',
 'motion_possible': 'OFF'}

I'm still seeing the issue where motion_possible is set to false but this time in AUTO mode. Previously I had been using MANUAL while holding both the deadman switch and the SHIFT key. But that didn't seam to affect motion_possible. I was hoping that in AUTO I would see a change, but I did not.

I still get the two errors that I posted above, which is likely the culprit. Would love to hear what you think here.

gavanderhoorn commented 7 years ago

If you're going to use Python, you could take a look at simple_message_py, which is a start of a Python implementation of the Simple Message protocol.

This is great. I was just starting to implement this. Any chance you could make it work without twisted? I'm currently using sockets and threads to make a simple client.

Only the protocol implementation is using twisted. The msg (de)serialisers are using python-construct, which does not depend on it.

gavanderhoorn commented 7 years ago

One last question: I can't seem to run this code in AUTO mode. I've set it to start on COLD START and while ROS_STATE and ROS_TRAJ appear to be running, I cannot connect to them.

That is sort of a 'known issue' (but not necessarily with the code in fanuc_driver_exp): getting Karel programs with socket/network functionality in them to work reliably when started using the cold start infrastructure is tricky. I'll probably spend some time on this eventually, but haven't so far.

Also: I'm not entirely sure cold start can be used for programs that (ultimately) initiate robot motion.

gavanderhoorn commented 7 years ago

I'm seeing two errors pop up from the USER interface:

  • has msg err: -67212

67 is HOST, so that is a HOST-212 SM: Write Direction shut down by Peer.

Are you closing your sockets after sending commands or at 'strange' or unexpected times? The traj relay expects clients to open a socket and then keep that open, sending new commands whenever needed. It can handle (repeated) disconnects, but you will run into these errors. It just means it can't read your msgs or transmit a reply.

  • pub A err: -67208

That is a HOST-208 SM: Not Yet Connected. With a 'normal' client, this should not really happen, but apparently the socket fd that writ_sm000A_(..) is attempting to use is not yet connected to a server tag. I'm not sure what would cause that, but please check that you're not repeatedly opening and closing new connections (for each new JOINT_TRAJ_PT fi).

In my experience, socket msging on Fanucs is not too robust, and it's easy to get Server tags into strange states which only a controller reboot can reset. So avoiding that is a good thing.

gavanderhoorn commented 7 years ago

As for motion_possible in STATUS: ignore that for now. I've had another report that it sometimes reports FALSE when everything is actually working.

gavanderhoorn commented 7 years ago

I put the robot in auto mode, cleared all the faults (ie everything is green) and then ran my command:

[..]

I expected this output, but nothing actually happened.

Do you see RTRJ pushed pt 1 on the TP after you've sent the point? If not, I wouldn't expect the robot to move, as the TP side will still be waiting for the new traj pt.

Few things to check:

  1. Can you first try and make sure that you're not repeatedly opening and closing connections?
  2. make sure you are sending msgs in big-endian mode (but you should not have received a SUCCESS reply if you didn't already do this)

I think I'll need more info to debug why you're not seeing any motion. Can you perhaps send me the test program that you're using to interact with ROS_TRAJ? Then I can try to reproduce what you are seeing.

A wireshark trace (ie: a capture file) of the communication would also be helpful.

chrisgilmerproj commented 7 years ago

Here's the code I'm using:

https://gist.github.com/chrisgilmerproj/04f945459f1c33a71be4fd886b31a375

It's using python 3.5. When I run this code I see this output:

$  python ./fanuc.py
Connecting to 192.168.1.201:11002 ...
Connecting to 192.168.1.201:11000 ...
{}
{}
SUCCESS
{'mode': <ModeValues.MANUAL: 1>, 'drives_powered': <TriStates.OFF: 0>, 'error_code': 11003, 'in_motion': <TriStates.OFF: 0>, 'in_error': <TriStates.ON: 1>, 'motion_possible': <TriStates.OFF: 0>, 'e_stopped': <TriStates.OFF: 0>}
{'j2': -0.5112387537956238, 'j5': 0.0006108059897087514, 'j4': 0.3402748107910156, 'j6': 0.0, 'j1': 0.8341720104217529, 'j9': 0.0, 'j0': -0.4494251310825348, 'j8': 0.0, 'j3': -0.05976962298154831, 'j7': 0.0, 'sequence': 0}

You can see my attempt here is to move the fanuc Joint 0 from -25deg to 0deg.

This implementation also means I can keep the sockets open as long as I want, which is what I see from the Teach Pendant. Here is the output from the User panel:

RTRJ ack: 1
RTRJ pushed: 1

If I set up a series of commands they will just repeat that same set of message over and over. Yet I'm still not seeing any movement.

chrisgilmerproj commented 7 years ago

I should note that I do see the position registers change as the program is running and the values are exactly what I expect. This makes me think that the problem isn't ROS_TRAJ but maybe ROS_MOVESM.

If it helps, the Flag states for 1 and 2 are both ON when I look at them. The Register values are:

R[1] = 50
R[2] = 50
R[3] = 10
R[4] = 50

And I don't know how to tell if I'm ever reaching this line: https://github.com/gavanderhoorn/fanuc_driver_exp/blob/master/ros_movesm.ls#L51

chrisgilmerproj commented 7 years ago

Alright, I got it to move, but I'm not sure what's happening here. I'll try to explain how I got there. I am in MANUAL mode and nothing was running. I then took these steps:

  1. I held down the deadman switch
  2. I hit RESET to clear all errors
  3. I hit SELECT and selected the ROS program
  4. I pushed the SHIFT + FWD button, which started the program
  5. I then let go of the deadman switch, which gave me SRVO-003 error
  6. I ran my python program and nothing happened.

So next I tried this:

  1. I hit SELECT and selected the ROS program.
  2. I hit Enter and it took me to the ROS_MOVESM program at line 16
  3. I held down the deadman switch
  4. I hit RESET to clear all errors
  5. I pushed the SHIFT + FWD button and heard the high pitched sound of the controller box but this time I did not let go of either shift or the deadman switch
  6. I ran my python program and the robot started to move (And I saw the ROS_MOVESM move to line 29)

What was interesting to me is having to restart the program with SHIFT+FWD from the ROS_MOVESM program. Then it all appeared to work.

I also tried aborting all programs and running it again from MANUAL with the errors cleared, deadman switch + SHIFT + FWD and it appeared to work again.

However, I tried to run this in AUTO mode and can't seem to get it to work at all. Apparently I can only get this to run in MANUAL, which is not ideal. I'll try some more things and see what I can divine for you.

chrisgilmerproj commented 7 years ago

In AUTO mode I went to Menu -> NEXT -> STATUS -> PROGRAM and saw this:

1. Task number: 5
2. Task name:   ROS
3. Program:     ROS
4. Routing:     ROS
   Line number: 3
   Status:      PAUSED

How do you get a task out of a PAUSED state in AUTO mode? That's probably the issue I'm facing.

Edit: I tried just hitting the Green Cycle Start button on the controller and got two alarms:

jettan commented 7 years ago

One of the possibilities is that you have your TP set to run one step at a time. I have encountered similar situations where it turned out I had the step on |-|-> instead of |-> (or the other way around).

When unsure what is happening, abort all tasks using FCTN->ABORT ALL and try again.

chrisgilmerproj commented 7 years ago

I've confirmed that I'm not in STEP mode. Currently my TP shows the word STEP with a green background.

I'm using some instructions to try and run this in AUTO:

Set System/Config:

Set Prog Select:

I haven't configured UI signals to remotely run because I'd like to stay away from mapping EthernetIP I/O to UOP inputs. But this may be inevitable.

I thought for sure that setting up OTHER/UOP for the program select would mean the program would start up on a reboot of the control box. But this doesn't appear to be the case.

How would you set up the program to run in AUTO mode?

jettan commented 7 years ago

To run in AUTO mode I do the following:

So far it has not failed me.

chrisgilmerproj commented 7 years ago

@jettan - I may have a slightly different controller than you. I have the R30iB Mate Operator Panel. Still, this shouldn't be a big difference. I followed your steps and was unable to get it to work:

At this point the BUSY and RUN icons are Yellow, The remaining icons (Step, Hold, Fault, I/O, $PROD, and TCyc) are all Green. Also, the Cycle Start button on the Operator Panel is not illuminated, it is off.

I hit the Cycle Start button on the Operator Panel and nothing happened, but I got two errors:

This is why I've updated the Prog Select screen (Menu -> Setup -> Prog Select) to the values above. Still, I haven't seen anything run.

Any chance you have other variables set that I don't know about?

gavanderhoorn commented 7 years ago

I have a feeling this thread is derailing a bit. The title of the issue is "Cannot connect socket to controller". I think we've resolved that one, and the current problem(s) is (are) not really related to that.

@chrisgilmerproj: could I please ask you to open a new issue for the problems you are having starting this in auto mode?

It helps me keeping things on-topic and referable in the future.

chrisgilmerproj commented 7 years ago

@gavanderhoorn - Yep, you're right it has gone off the rails a bit and I apologize. I figured out the missing piece of information here, which if you point me in the right direction I'll add to your wiki in the appropriate place:

In the System/Config menu you need to set

In the Prog Select menu you need to set:

Now with the controller set to AUTO and the TP set to OFF you can hit the Cycle Start button and everything will work.

I successfully started in AUTO with these settings and was able to run the code above with no problems. If, however, something here seems odd to you I'm happy to discuss it outside the thread.

Thanks @gavanderhoorn and @jettan for all your help, I know this was a couple days of work on your part to get me running and if I can contribute back to documentation in any way I'd be happy to.

gavanderhoorn commented 7 years ago

As to the problems you were having before https://github.com/gavanderhoorn/fanuc_driver_exp/issues/2#issuecomment-294041263.

This is going to sound a little harsh, but have you had any training from/by Fanuc? In Manual mode you cannot let go of either the deadman switch or the shift key. If you do, any task that is either a task with motion groups locked or that wasn't configured to ignore certain events will be paused by the runtime system.

So your LR Mate not actually executing the motion is completely expected and explainable, even if pushed N is printed on the user menu.

As to why you can't get it to start in Auto mode: this sounds like a configuration issue with your controller on your end. I don't think it's something in fanuc_driver_exp, as I run it in Auto all the time, and @jettan does so as well. Note that I'm not saying that it couldn't be some issue with fanuc_driver_exp, but given that you're running into some other errors with starting programs in Auto I would think of a general controller configuration error first.

Have you tried starting a really simple TP / Karel program in Auto?

chrisgilmerproj commented 7 years ago

Just as a follow up, no, I have not had any FANUC training. I've only been handed a robot and all their documentation associated with the model. But I do have enough experience to know that in MANUAL you need the deadman + SHIFT key pressed (although it turns out the shift key on my machine is fickle and will sometimes un-depress even when you think you're holding it).

As for AUTO mode, as mentioned in my previous comment, it is simply not sufficient enough to tell users of the driver that they need to "just run in AUTO". There are a number of settings that need to be configured before the driver will actually work, which are not documented in the ROS wiki or in the github repo. Happily I figured this out and would love to contribute to making sure this documentation is available.

Again, I appreciate the help and time you've contributed to helping me get up to speed. Again, I'm happy to help do some documentation to help others just starting out with FANUC and the ROS driver. Just let me know and I'll try to repay your kindness.

gavanderhoorn commented 7 years ago

But I do have enough experience to know that in MANUAL you need the deadman + SHIFT key pressed

Ok, well in the first part of https://github.com/gavanderhoorn/fanuc_driver_exp/issues/2#issuecomment-294041263 you write that you are letting go of the deadman, and then notice that the robot is not moving. That confused me a little.

And just to make clear: I'm not judging in any way. The only reason I ask is because it let's me gauge what level of experience to expect and how to phrase certain things / how detailed my answers need to be.

Also: I sometimes get a little uncomfortable as I find that some users use my installation tutorials as their first 'intro' to Fanuc robots. That can become really dangerous, and is clearly not how things are intended. I purposefully leave out certain things as it should/could force people to reach out to their Fanuc support offices and have them clarify things for them or force them to read manuals.

it is simply not sufficient enough to tell users of the driver that they need to "just run in AUTO"

I'm not sure, but I don't think we actually say/write that anywhere? If we do, please link me to it and I will change that.

Again, I appreciate the help and time you've contributed to helping me get up to speed.

No problem, that is what we're here for.