lbr-stack / lbr_fri_ros2_stack

ROS 1/2 integration for KUKA LBR IIWA 7/14 and Med 7/14
https://lbr-stack.readthedocs.io/en/latest/
Apache License 2.0

FRI >= 2: Invalid commanded_joint_position state interface #181

Closed OmidRezayof closed 1 month ago

OmidRezayof commented 1 month ago

Hi, I'm using an LBR Med 7 R800 with FRI Med 3.0. Unfortunately, my manager did not allow me to contribute the package to the community due to legal issues.

I cloned everything as explained in the startup guide and followed the step-by-step guide to first run everything with FRI 2.5 using the simulators, and things looked fine. Afterwards, I replaced FRI-Client-SDK_Cpp.zip with my new zip file and had to make some changes to LBRServer.java to meet the new syntax requirements (things changed slightly between FRI 2.5 and 3.0). I also rebuilt the package after replacing the zip file. LBRServer.java now appears to run without errors on the SmartPad.

However, when I try to run the demos with the real hardware, the LBRServer app on the SmartPad gives the error "Timeout before FRI connection quality reached GOOD or EXCELLENT" after 10 seconds. I suspect that the UDP connection behind the FRI connection is never started, which causes the timeout. (I have checked the connection between the robot and the PC by pinging the robot.) I looked into the example code but wasn't able to find where the UDP connection is initiated (i.e. I'm not sure where the App and AsyncClient are called when running the "Joint Trajectory Controller" Python demo). I have also attached a picture of my terminal, which suggests that the UDP connection is not getting started for some reason.

I would greatly appreciate any help.

[Screenshots: 2024-05-28 17-35-02, 2024-05-28 17-35-39]

mhubii commented 1 month ago

hi @OmidRezayof, thanks for sharing. Let me try to answer the questions that can be answered first:

Why is the connection not established?

Quick fix

Remove this line:

https://github.com/lbr-stack/lbr_fri_ros2_stack/blob/a752eb21271b42be7f6e45cbb333def87731b175/lbr_ros2_control/config/lbr_system_interface.xacro#L106

What causes this error

FRI 2.x and later remove commanded_joint_position. This is likely your error:

[ros2_control_node-2] missing state interfaces:

Recent changes to this repository attempt to handle all FRI versions equally. This bug needs to be fixed still.

Where is the connection established?

The system interface runs the connection here:

https://github.com/lbr-stack/lbr_fri_ros2_stack/blob/a752eb21271b42be7f6e45cbb333def87731b175/lbr_ros2_control/src/system_interface.cpp#L206

The run_async method of lbr_fri_ros2::App creates an asynchronous thread:

https://github.com/lbr-stack/lbr_fri_ros2_stack/blob/a752eb21271b42be7f6e45cbb333def87731b175/lbr_fri_ros2/src/app.cpp#L105

License / legal?

You can check KUKA's license note in the FRI files. From FRI 1.11 onward, these allow redistribution. I don't know about 3.0.

mhubii commented 1 month ago

hi @fredRocs, could you please do me a massive favor and try connecting your robot? See if you get a similar error?

OmidRezayof commented 1 month ago

Thanks @mhubii for your reply! I tried removing that line from the file you sent and rebuilt the whole lbr-stack workspace, and now I've got new errors. Any idea about this? [image]

mhubii commented 1 month ago

Sorry about that. Will fix this properly soon.

See the error: expected 7, got 6?

There is a line in the code that requires changing to 6:

https://github.com/lbr-stack/lbr_fri_ros2_stack/blob/humble/lbr_ros2_control/include/lbr_ros2_control/system_interface.hpp#L71

This hiccup is caused by recent code changes. Thank you again for this valuable feedback @OmidRezayof!

If you could create a PR to fix these sooner, I am happy to review.

OmidRezayof commented 1 month ago

No need to apologize @mhubii, I really appreciate your support. It looks like I was finally able to open the UDP channel after changing the required number of state interfaces to 6, but now it seems to complain about not being able to find the robot. I first start the LBRServer with the recommended specs and then quickly run the launch file as in the demo.

PS: I have pinged the robot and it seems to be fine. [Screenshot: 2024-05-29 09-42-06]

fredRocs commented 1 month ago

> hi @fredRocs, could you please do me a massive favor and try connecting your robot? See if you get a similar error?

It seems to work fine on my side when I run ros2 launch lbr_bringup bringup.launch.py sim:=false ctrl:=joint_trajectory_controller model:=med14. The signal stays EXCELLENT.

OmidRezayof commented 1 month ago

Just wanted to add something: when I run "colcon build --symlink-install" in my workspace, I get this warning. I didn't think it was important, so I ignored it. Maybe this is the problem? [image]

OmidRezayof commented 1 month ago

Dear @mhubii, I forked the repo and sent a PR so you can see my code. The edited code is on the FRI3 branch of my fork. Please accept my apologies, since I'm not a pro with GitHub and may have made mistakes in this process (GitHub can be very confusing!!)

The final comments that I made on this issue thread were:

  1. After changing the required number of state interfaces to 6, the UDP socket is opened, but it now cannot find the robot on the UDP channel.
  2. I get warnings when building the packages in my lbr-stack workspace, and I'm not sure whether they contribute to the issue we are discussing.

mhubii commented 1 month ago

okay great. Thanks for the PR.

In your version of the LBRServer Java application, what IP do you set for the FRI connection? When you configure your FRI connection, there should be something similar to:

fri_configuration_ = FRIConfiguration.createRemoteConfiguration(lbr_, "172.31.1.148");

This IP should match your computer's IP address. For example, you can:
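One way to verify this from the computer's side is to try binding a socket to the address: the operating system only allows a bind to addresses it actually owns. This is a minimal sketch and not part of the original reply; `is_local_ip` is a hypothetical helper name, and 172.31.1.148 is simply the address from the Java snippet above.

```python
import socket


def is_local_ip(ip: str) -> bool:
    """Return True if `ip` is assigned to an interface on this machine.

    Binding a UDP socket only succeeds for addresses the host owns, so a
    failed bind means the address is not configured locally.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.bind((ip, 0))  # port 0 lets the OS pick any free port
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the createRemoteConfiguration call above; replace
    # it with whatever your LBRServer application is configured to use.
    print(is_local_ip("172.31.1.148"))
```

If this prints False, the address passed to createRemoteConfiguration is not configured on any network interface of the computer.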

OmidRezayof commented 1 month ago

Yes, I'm indeed using the same IP (.148) for my PC and am choosing the same on the SmartPad. I'm still not sure why the robot cannot find the communication line. It's hard for me to tell whether you have made any changes to the branch I opened the PR from. Have you made any changes that could solve this?

mhubii commented 1 month ago

no problem. Your changes are fine.

Netmask

One thing to check is the netmask. The netmask on your computer should match StationSetup.cat -> Configuration -> Netmask for 172... connection on the controller.

Firewall

Some people have reported firewall issues in the past:

https://github.com/lbr-stack/lbr_fri_ros2_stack/issues/67#issuecomment-1479794623
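To tell a firewall or routing problem apart from a software problem, it can help to check whether any UDP datagram from the controller reaches the computer at all. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the port you listen on is an assumption: 30200 is commonly cited as the FRI default, but please check your own configuration.

```python
import socket
from typing import Optional, Tuple


def open_listener(host: str, port: int) -> socket.socket:
    """Open a UDP socket bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(
    sock: socket.socket, timeout_s: float
) -> Optional[Tuple[str, int]]:
    """Wait for one datagram.

    Returns (sender_ip, payload_size_in_bytes), or None if nothing
    arrived within timeout_s seconds.
    """
    sock.settimeout(timeout_s)
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(4096)
    except socket.timeout:
        return None
    return sender_ip, len(data)
```

With the ROS 2 side stopped (so the port is free), one could call `wait_for_datagram(open_listener("0.0.0.0", 30200), 10.0)` right after starting LBRServer on the SmartPad. A `None` result would suggest that packets are dropped before they reach the application, for example by a firewall.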

Demo applications

If an updated netmask does not work, I would suggest running KUKA's demo applications with the raw FRI (no LBRServer, no ROS 2 integration) so we can make sure it runs in principle. You can find some simple instructions here:

To run them, check the IP in the LBRJointSineOverlay.java application. Make sure it matches your computer's.

Please let me know if any of these help @OmidRezayof

OmidRezayof commented 1 month ago

On Sunrise, I can see two subnet masks:

Subnet mask > FFFF0000
KONI > Subnet mask > FFFFFF00

On my PC I have also tried using net masks of 255.255.0.0 and 255.255.255.0
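The hexadecimal masks shown on the controller and the dotted masks set on the PC can be compared with Python's `ipaddress` module. This is a small sketch: the helper names are hypothetical, and the robot address 172.31.1.147 is only an assumed placeholder, since the controller's IP is not stated in this thread.

```python
import ipaddress


def hex_netmask_to_dotted(hex_mask: str) -> str:
    """Convert a hexadecimal netmask such as 'FFFF0000' to dotted notation."""
    return str(ipaddress.IPv4Address(int(hex_mask, 16)))


def same_subnet(ip_a: str, ip_b: str, netmask: str) -> bool:
    """Return True if both addresses fall into the same network for `netmask`."""
    # strict=False allows passing a host address instead of a network address.
    network = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in network


if __name__ == "__main__":
    print(hex_netmask_to_dotted("FFFF0000"))  # 255.255.0.0
    print(hex_netmask_to_dotted("FFFFFF00"))  # 255.255.255.0
    # 172.31.1.147 is an assumed robot address, used for illustration only.
    print(same_subnet("172.31.1.148", "172.31.1.147", "255.255.0.0"))  # True
```

So FFFF0000 corresponds to 255.255.0.0 and FFFFFF00 to 255.255.255.0.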

The connection is still not established. [image] Please let me know if you see anything wrong.

I have tried disabling firewall; didn't work either.

I will be trying their own example applications soon.

Update: I first tried to build the fri package as in here and got the error:

CMake Error at CMakeLists.txt:57 (message): Expected FRIClient ersion 2.5, found: .

So it's probably a build problem in the fri package. I'm not sure how to fix this, especially since it looks like it cannot find any version numbers (?.?). I have only replaced the .zip file with my FRI 3.0 package and changed the LBRServer app on another Windows machine within Sunrise. Any idea?

I have also made a new branch for fri-3.0 with the code in it and sent a PR so you can see and edit it. I greatly appreciate all of your support and look forward to hearing back from @mhubii. Thanks again.

mhubii commented 1 month ago

Netmask

Your netmask looks good.

For the different ethernet connections:

Expected 2.5, found ...

To check all the FRI versions in their .zip file, we added a version check in CMake:

https://github.com/lbr-stack/fri/blob/269ea63e5566a06cf11a768db61abf8442d96e70/CMakeLists.txt#L56

We extract the FRI version through a regular expression:

https://github.com/lbr-stack/fri/blob/269ea63e5566a06cf11a768db61abf8442d96e70/CMakeLists.txt#L50

This is simply a check, to verify contributors upload the claimed version. In your case (for testing purposes), you can simply remove this check or put 3.0. I believe you have already done this in your code:

https://github.com/OmidRezayof/fri/blob/f7baad7bb2cb6d990a0943d85becce6ac206794b/CMakeLists.txt#L56

Just make sure you use your .zip file for FRI 3.0.
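The logic of such a check can be sketched in Python. This is not the repository's CMake code: the regular expression and the function names are assumptions, and the sketch presumes the headers carry a Doxygen-style `\version {x.y.z}` tag, as quoted later in this thread. It also illustrates how an error ending in "found: ." can presumably arise when no version is matched and both the major and minor parts stay empty.

```python
import re

# Assumed pattern: matches a Doxygen-style tag such as "\version {2.7.0}".
VERSION_PATTERN = re.compile(r"\\version\s*\{(\d+)\.(\d+)(?:\.\d+)?\}")


def extract_major_minor(header_text: str):
    """Return (major, minor) as strings, or two empty strings if no tag is found."""
    match = VERSION_PATTERN.search(header_text)
    if match is None:
        return "", ""
    return match.group(1), match.group(2)


def check_version(header_text: str, expected: str) -> str:
    """Raise ValueError if the version found in the header differs from `expected`."""
    major, minor = extract_major_minor(header_text)
    found = f"{major}.{minor}"
    if found != expected:
        raise ValueError(f"Expected FRIClient version {expected}, found: {found}")
    return found


if __name__ == "__main__":
    print(check_version(r" * \version {2.7.0}", "2.7"))  # 2.7
```

Passing a header without a version tag makes `check_version` raise with a message ending in "found: .", which has the same shape as the CMake error reported above.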

Out of curiosity, can you share KUKA's license note from one of the header files?

OmidRezayof commented 1 month ago

@mhubii I finally got the permission from my PI to share the fri 3.0. I'm not sure why I couldn't find my updated fri-3.0 branch on lbr-stack/fri but I uploaded the sdk under my forked repo under the branch fri-3.0 (here) which you should be able to see. Let me know if you see anything that can fix the build issue.

PS the license wasn't the issue I just had to get the permission from my boss.

mhubii commented 1 month ago

Okay, FRI 2.7 (I believe it is not 3.0) is now live. One thing: I think I rebased your PR https://github.com/lbr-stack/lbr_fri_ros2_stack/pull/182

This might remove your contribution from the git history.

Could you please re-open a new PR with the same changes against this branch:

https://github.com/lbr-stack/lbr_fri_ros2_stack/tree/dev-humble-state-interface-fix

Want to make sure your contribution is rightfully attributed.

OmidRezayof commented 1 month ago

Hmm, not sure why 2.7, because it is referred to as 3.0 in all their documentation. I will try it with the new code and see if it fixes the build problems. As for the new PR, do I have to open it just for lbr_fri_ros2_stack or also for fri-2.7?

mhubii commented 1 month ago

For lbr_fri_ros2_stack. It is strange because in the source code it says version 2.7.0.

If you check any of the headers, you will find: \version {2.7.0}

I'll simply mention you as a maintainer in the fri repo to compensate. But lbr_fri_ros2_stack has greater visibility so more important either way.

OmidRezayof commented 1 month ago

@mhubii I tried running the example application and got all kinds of different errors on my PC every time I ran LBRJointSineOverlay.java on the SmartPad:

!!decoding error on Monitor message: invalid wire_type!!
!!decoding error on Monitor message: missing required field!!
!!decoding error on Monitor message: end-of-stream!!
!!decoding error on Monitor message: parent stream too short!!
!!decoding error on Monitor message: end-of-stream!!

I suspect this is an issue with the newly released FRI package, because when I downgraded to Sunrise 2.6 and FRI 2.5, I didn't have this problem. I was able to successfully establish the ROS/FRI connection and move the joints using the demo. For now, I think I will continue using FRI 2.5, and if KUKA releases any updates after 3.0, I may try to get it to work.

I just wanted to thank you a lot for the troubleshooting @mhubii.

mhubii commented 1 month ago

Ah okay great, well this issue here still needs fixing, will merge your contribution asap.

I would contact KUKA because there appears to be some version mismatch between the FRI client SDK (2.7.0) and what is on the controller. This might be a bug on their end?

OmidRezayof commented 1 month ago

Sorry for my late reply. I'm not quite sure which version mismatch you are referring to. I do see version 2.7.0 in the header files, but that could be the Sunrise.workbench version, since we were actually using Sunrise.workbench 2.7.0. I'm pretty sure all the documents we have, and even the file name of the FRI option package we received, are labeled fri-3.0.

Could it be that the CMake support code you have written for fri<=2.x is not suitable for compiling fri-3.0? I have no idea, but I wasn't able to run KUKA's example FRI code using the fri-3.0 SDK, so I'm assuming it's either a KUKA issue or an issue with the CMake compatibility code.

mhubii commented 1 month ago

thank you again @OmidRezayof for your contribution! You should now be listed for your efforts.

I am closing this as the initial issue was resolved. As for the FRI, I opened a new issue here: https://github.com/lbr-stack/fri/issues/29

Please make sure to pull the latest changes. When running, make sure to specify your FRI version here:

https://github.com/lbr-stack/lbr_fri_ros2_stack/blob/77f6979b3bf19596286d723598de6c4419d2ea48/lbr_ros2_control/config/lbr_system_parameters.yaml#L4

Happy to dig further into FRI 3 / 2.7, as it appears this is the cause of the connection issues (either CMake or what not).

Wish you a good weekend!