Yaskawa-Global / motoros2

ROS 2 (rcl, rclc & micro-ROS) node for MotoPlus-compatible Yaskawa Motoman robot controllers

Support Iron Irwini #52

Open · gavanderhoorn opened this issue 1 year ago

gavanderhoorn commented 1 year ago

Tasks:

gavanderhoorn commented 1 year ago

If I'm not mistaken, Iron support has just been released upstream: micro-ROS Iron: 4.1.0.

gavanderhoorn commented 10 months ago

@ted-miller: I believe you suggested adding this to the 0.2.0 milestone in #120?


Edit: oh wait, I see I already added it. Nm.

gavanderhoorn commented 6 months ago

Something to be aware of: https://github.com/ros2/rmw_dds_common/pull/68.

It's almost getting to the point where it would pay off to implement some sort of CI that checks message compatibility between different ROS 2 distributions.
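
A rough sketch of the idea (assuming the stock ros:humble and ros:iron Docker images; the message names are arbitrary examples): diff the ros2 interface show output across distro containers.

# hypothetical compatibility check: diff message definitions between two distros
for msg in sensor_msgs/msg/JointState trajectory_msgs/msg/JointTrajectory; do
  diff \
    <(docker run --rm ros:humble ros2 interface show "$msg") \
    <(docker run --rm ros:iron   ros2 interface show "$msg") \
    > /dev/null && echo "$msg: identical" || echo "$msg: DIFFERS"
done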

ted-miller commented 3 months ago

Doesn't build yet. But I'm working on this: https://github.com/Yaskawa-Global/motoros2/tree/iron_wip

ted-miller commented 3 months ago

update VS project with Iron targets for all supported controllers

This kinda sucks to do, and the current method does not scale well. I'm assuming the cmake-passthrough mode of the M+SDK would make compilation easier, but I'm not sure how IntelliSense would behave without the stuff that was manually added.

ted-miller commented 3 months ago

This is only very quick/preliminary testing, but I wanted to write it down before I leave.

AFAICT, libmicroros seems OK. (More testing needed)

Other things I noticed using the Iron version:

ted-miller commented 3 months ago

I got the invalid job alarm. I then deleted the job, rebooted, and let it regenerate a new job. I then got the alarm again. Something fishy is going on here.

I did it yet again, and the alarm didn't occur. So it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete it. Perhaps I just did a CPU reset too quickly after the deletion?


When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a Humble-built version of these messages and never rebuilt them for Iron. So it's probably that.

Yeah, I rebuilt the messages for Iron and it worked.
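
For reference, a minimal sketch of such a rebuild (workspace path is an example):

# after sourcing Iron instead of Humble, rebuild the interface package
source /opt/ros/iron/setup.bash
cd ~/ws_iron   # example workspace containing the motoros2_interfaces sources
colcon build --packages-select motoros2_interfaces
source install/setup.bash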


I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

ros2 service call /read_group_io motoros2_interfaces/srv/ReadGroupIO "{address: 7001}"
2024-03-26 13:14:07.410 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
requester: making request: motoros2_interfaces.srv.ReadGroupIO_Request(address=7001)

response:
motoros2_interfaces.srv.ReadGroupIO_Response(result_code=0, message='Success', success=True, value=192)

But if I go back to the original terminal, I can make those service calls without any issue.

It seems that fastrtps might not be thread-safe. More testing is needed before I can submit a ticket to eProsima.

A quick Google search turned up someone who got a similar error caused by a version mismatch.

My Docker container:

ros2 doctor --report | grep fastrtps
2024-03-26 13:17:01.132 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
rosidl_typesupport_fastrtps_cpp           : latest=3.0.2, local=3.0.2
rmw_fastrtps_shared_cpp                   : latest=7.1.3, local=7.1.3
rosidl_dynamic_typesupport_fastrtps       : latest=0.0.2, local=0.0.2
rosidl_typesupport_fastrtps_c             : latest=3.0.2, local=3.0.2
rmw_fastrtps_cpp                          : latest=7.1.3, local=7.1.3
fastrtps_cmake_module                     : latest=3.0.2, local=3.0.2
middleware name    : rmw_fastrtps_cpp

I'm not entirely sure which of our packages we would want to compare that to. I guess it would be the Agent, since it's doing the translation between fastrtps and microxrcedds.

The Agent container:

ros2 doctor --report | grep fastrtps
/opt/ros/humble/lib/python3.10/site-packages/ros2doctor/api/__init__.py:154: UserWarning: Fail to call QoSCompatibilityReport class functions.
rosidl_typesupport_fastrtps_cpp           : latest=2.2.2, local=2.2.2
rmw_fastrtps_shared_cpp                   : latest=6.2.6, local=6.2.5
rosidl_typesupport_fastrtps_c             : latest=2.2.2, local=2.2.2
rmw_fastrtps_cpp                          : latest=6.2.6, local=6.2.5
fastrtps_cmake_module                     : latest=2.2.2, local=2.2.2
middleware name    : rmw_fastrtps_cpp

Definitely some differences. Don't know if they're important.


I was surprised that control_msgs wasn't included by default in the Iron Docker image. I feel like it was included in Humble.

I had to manually clone and build this.
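
A minimal sketch of that manual build (the iron branch name is an assumption):

# clone and build control_msgs from source against Iron
mkdir -p ~/ws_iron/src && cd ~/ws_iron/src
git clone -b iron https://github.com/ros-controls/control_msgs.git
cd ~/ws_iron && colcon build --packages-select control_msgs
source install/setup.bash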


After running the FJT script, I got this message in my client.

control_msgs.action.FollowJointTrajectory_Result(error_code=-500301, error_string='Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]')

This is very confusing.

Looking at the logging script, I think it might be due to the goal time tolerance rather than the position tolerance.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[1711461114.98533487] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 FJT using DEFAULT goal time tolerance: 500000000 ns
[1711461114.98539233] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 FJT action failed
[1711461114.98543978] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]

I might need to open a new issue for this.
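
If the goal time tolerance is the culprit, setting it explicitly in the goal should confirm that. A minimal sketch (action name, joint names and waypoints are placeholders, not the actual test setup):

ros2 action send_goal /follow_joint_trajectory control_msgs/action/FollowJointTrajectory "{
  trajectory: {
    joint_names: [group_1/joint_1, group_1/joint_2],
    points: [
      {positions: [0.0, 0.0], time_from_start: {sec: 2, nanosec: 0}},
      {positions: [0.5, 0.3], time_from_start: {sec: 5, nanosec: 0}}
    ]
  },
  goal_time_tolerance: {sec: 5, nanosec: 0}
}"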

ted-miller commented 3 months ago

Aside from the misc notes above, it seems to be working. I was able to exercise the topics, the services, and both types of motion.

gavanderhoorn commented 3 months ago

I got the invalid job alarm. I then deleted the job, rebooted, and let it regenerate a new job. I then got the alarm again. Something fishy is going on here.

I did it yet again, and the alarm didn't occur. So it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete it. Perhaps I just did a CPU reset too quickly after the deletion?

I seem to remember having run into this as well, and then we added this to the Alarm: 8014[1] FAQ:

If the alarm is posted again after restarting the controller, make sure to allow sufficient time for the controller to properly delete the job (flushing changes to the file system may take some time). Allow for at least 20 seconds between deleting the job and restarting the controller.

Could be what you experienced.

When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a Humble-built version of these messages and never rebuilt them for Iron. So it's probably that.

Yeah, I rebuilt the messages for Iron and it worked.

I would've expected that, yes.

Key things have changed between Humble and Iron. In particular, https://github.com/ros2/rmw_dds_common/pull/68 breaks everything between those versions.

I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

[..]

A quick Google search turned up someone who got a similar error caused by a version mismatch.

[..]

Definitely some differences. Don't know if they're important.

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

It's also slightly surprising to see the SHM transport mentioned in the error message. AFAIK, that's explicitly disabled in the Agent Docker image.
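
For completeness: if SHM does turn out to be active somewhere, it can be forced off with a Fast DDS XML profile. A minimal sketch (file name and location are arbitrary examples):

# restrict the default participant to UDPv4 only, disabling SHM
cat > /tmp/fastdds_no_shm.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>udp_only</transport_id>
      <type>UDPv4</type>
    </transport_descriptor>
  </transport_descriptors>
  <participant profile_name="no_shm" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_only</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>
    </rtps>
  </participant>
</profiles>
EOF
export FASTRTPS_DEFAULT_PROFILES_FILE=/tmp/fastdds_no_shm.xml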

I was surprised that control_msgs wasn't included by default in the Iron Docker image. I feel like it was included in Humble.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

Slightly off-topic, but:

Looking at the logging script, I think it might be due to the goal time tolerance rather than the position tolerance.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[..]

Is your debug listener out-of-date? That formatting (the date duplication) was fixed a long time ago.

ted-miller commented 3 months ago

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

That's fair.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

I didn't know it had an apt package.

Is your debug listener out-of-date?

Extremely. I keep forgetting to update it.

ted-miller commented 3 months ago

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

Do you have a trick for updating this? I'm just going through the repos list and manually checking for newer versions. I'm assuming the following repos are applicable:

gavanderhoorn commented 3 months ago

I have a tool that checks our .repos against upstream. But part of it is manual work, yes.

I'll take a look tomorrow.


Edit:

Inconsequential changes

I always either update everything, or nothing.

Things may look inconsequential, but specific releases have been tested against/with each other. To avoid having to vet things ourselves again at all levels, it's best to stick (or at least start) with whatever upstream has tagged as a release.
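
Not the actual tool, but a rough sketch of the same idea (assumes a vcstool-style .repos file, PyYAML, and Git >= 2.18 for --sort; the file name is an example):

# for each repository in the .repos file, print pinned version vs newest upstream tag
python3 -c "
import yaml
d = yaml.safe_load(open('ros2.repos'))
for name, r in sorted(d['repositories'].items()):
    print(name, r['url'], r.get('version', '-'))
" | while read -r name url pinned; do
  latest=$(git ls-remote --tags --sort=-v:refname "$url" \
    | grep -v '\^{}' | head -n1 | sed 's@.*refs/tags/@@')
  echo "$name: pinned=$pinned latest=$latest"
done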

ted-miller commented 3 months ago

I pulled the latest Micro-XRCE-DDS-Client and applied the custom changes for a quick test.

I was not able to reproduce the original issue after updating. But I can't say with any certainty that the problem is fixed, since I never reproduced the original issue multiple times before updating. (Hindsight being 20/20 and all...)

gavanderhoorn commented 3 months ago

Combined with the latest test build of Iron M+ libmicroros (shared with you in the other thread), https://github.com/Yaskawa-Global/motoros2/compare/main...gavanderhoorn:iron_no_remap_rules should build.


Edit: this is of course just a workaround for now.

ted-miller commented 2 weeks ago

@gavanderhoorn Is Iron still of any benefit? I've asked @jimmy-mcelwain to start looking at this, but I see that micro-ROS now supports Jazzy.

gavanderhoorn commented 2 weeks ago

Yes. We should support Humble, Iron and Jazzy.

Humble and Jazzy are LTS releases, so they will be around for quite some time.

Iron is not an LTS, but supporting it should prepare us for Jazzy.

gavanderhoorn commented 1 week ago

Updates to micro_ros_motoplus: Yaskawa-Global/micro_ros_motoplus#3.