Yaskawa-Global / motoros2

ROS 2 (rcl, rclc & micro-ROS) node for MotoPlus-compatible Yaskawa Motoman robot controllers

Support Iron Irwini #52

Open · gavanderhoorn opened this issue 1 year ago

gavanderhoorn commented 1 year ago

Tasks:

gavanderhoorn commented 1 year ago

If I'm not mistaken, Iron support has just been released upstream: micro-ROS Iron: 4.1.0.

gavanderhoorn commented 10 months ago

@ted-miller: I believe you suggested adding this to the 0.2.0 milestone in #120?


Edit: oh wait, I see I already added it. Nm.

gavanderhoorn commented 6 months ago

Something to be aware of: https://github.com/ros2/rmw_dds_common/pull/68.

It's almost getting to the point where it would pay off to implement some sort of CI that checks message compatibility between different ROS 2 distributions.
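
A rough sketch of the idea (assuming the stock ros:humble and ros:iron Docker images; the message names are arbitrary examples): diff the ros2 interface show output across distro containers.

# hypothetical compatibility check: diff message definitions between two distros
for msg in sensor_msgs/msg/JointState trajectory_msgs/msg/JointTrajectory; do
  diff \
    <(docker run --rm ros:humble ros2 interface show "$msg") \
    <(docker run --rm ros:iron   ros2 interface show "$msg") \
    > /dev/null && echo "$msg: identical" || echo "$msg: DIFFERS"
done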

ted-miller commented 3 months ago

Doesn't build yet. But I'm working on this: https://github.com/Yaskawa-Global/motoros2/tree/iron_wip

ted-miller commented 3 months ago

update VS project with Iron targets for all supported controllers

This kinda sucks to do, and the current method does not scale well. I'm assuming the cmake-passthrough mode of the M+SDK would make compilation easier, but I'm not sure how IntelliSense would behave without the stuff that was manually added.

ted-miller commented 3 months ago

This is only very quick/preliminary testing, but I wanted to write it down before I leave.

AFAICT, libmicroros seems OK. (More testing needed)

Other things I noticed using the Iron version:

ted-miller commented 3 months ago

I got the invalid job alarm. I then deleted the job, rebooted, and let it regenerate a new job. I then got the alarm again. Something fishy is going on here.

I did it yet again, and the alarm didn't occur. So it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete it. Perhaps I just did a CPU reset too quickly after the deletion?


When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a Humble-built version of these messages and never rebuilt them for Iron. So it's probably that.

Yeah, I rebuilt the messages for Iron and it worked.
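
For reference, a minimal sketch of such a rebuild (workspace path is an example):

# after sourcing Iron instead of Humble, rebuild the interface package
source /opt/ros/iron/setup.bash
cd ~/ws_iron   # example workspace containing the motoros2_interfaces sources
colcon build --packages-select motoros2_interfaces
source install/setup.bash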


I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

ros2 service call /read_group_io motoros2_interfaces/srv/ReadGroupIO "{address: 7001}"
2024-03-26 13:14:07.410 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
requester: making request: motoros2_interfaces.srv.ReadGroupIO_Request(address=7001)

response:
motoros2_interfaces.srv.ReadGroupIO_Response(result_code=0, message='Success', success=True, value=192)

But if I go back to the original terminal, I can make those service calls without any issue.

It seems that fastrtps might not be thread-safe. More testing is needed before I can submit a ticket to eProsima.

A quick Google search turned up someone who got a similar error caused by a version mismatch.

My Docker container:

ros2 doctor --report | grep fastrtps
2024-03-26 13:17:01.132 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
rosidl_typesupport_fastrtps_cpp           : latest=3.0.2, local=3.0.2
rmw_fastrtps_shared_cpp                   : latest=7.1.3, local=7.1.3
rosidl_dynamic_typesupport_fastrtps       : latest=0.0.2, local=0.0.2
rosidl_typesupport_fastrtps_c             : latest=3.0.2, local=3.0.2
rmw_fastrtps_cpp                          : latest=7.1.3, local=7.1.3
fastrtps_cmake_module                     : latest=3.0.2, local=3.0.2
middleware name    : rmw_fastrtps_cpp

I'm not entirely sure which of our packages we would want to compare that to. I guess it would be the Agent, since it's doing the translation between fastrtps and microxrcedds.

The Agent container:

ros2 doctor --report | grep fastrtps
/opt/ros/humble/lib/python3.10/site-packages/ros2doctor/api/__init__.py:154: UserWarning: Fail to call QoSCompatibilityReport class functions.
rosidl_typesupport_fastrtps_cpp           : latest=2.2.2, local=2.2.2
rmw_fastrtps_shared_cpp                   : latest=6.2.6, local=6.2.5
rosidl_typesupport_fastrtps_c             : latest=2.2.2, local=2.2.2
rmw_fastrtps_cpp                          : latest=6.2.6, local=6.2.5
fastrtps_cmake_module                     : latest=2.2.2, local=2.2.2
middleware name    : rmw_fastrtps_cpp

Definitely some differences. Don't know if they're important.


I was surprised that control_msgs wasn't included by default in the Iron Docker image. I feel like it was included in Humble.

I had to manually clone and build this.
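
A minimal sketch of that manual build (the iron branch name is an assumption):

# clone and build control_msgs from source against Iron
mkdir -p ~/ws_iron/src && cd ~/ws_iron/src
git clone -b iron https://github.com/ros-controls/control_msgs.git
cd ~/ws_iron && colcon build --packages-select control_msgs
source install/setup.bash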


After running the FJT script, I got this message in my client.

control_msgs.action.FollowJointTrajectory_Result(error_code=-500301, error_string='Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]')

This is very confusing.

Looking at the logging script, I think it might be due to the goal time tolerance rather than the position tolerance.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[1711461114.98533487] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 FJT using DEFAULT goal time tolerance: 500000000 ns
[1711461114.98539233] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 FJT action failed
[1711461114.98543978] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]

I might need to open a new issue for this.
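
If the goal time tolerance is the culprit, setting it explicitly in the goal should confirm that. A minimal sketch (action name, joint names and waypoints are placeholders, not the actual test setup):

ros2 action send_goal /follow_joint_trajectory control_msgs/action/FollowJointTrajectory "{
  trajectory: {
    joint_names: [group_1/joint_1, group_1/joint_2],
    points: [
      {positions: [0.0, 0.0], time_from_start: {sec: 2, nanosec: 0}},
      {positions: [0.5, 0.3], time_from_start: {sec: 5, nanosec: 0}}
    ]
  },
  goal_time_tolerance: {sec: 5, nanosec: 0}
}"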

ted-miller commented 3 months ago

Aside from the misc notes above, it seems to be working. I was able to exercise the topics, the services, and both types of motion.

gavanderhoorn commented 3 months ago

I got the invalid job alarm. I then deleted the job, rebooted, and let it regenerate a new job. I then got the alarm again. Something fishy is going on here.

I did it yet again, and the alarm didn't occur. So it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete it. Perhaps I just did a CPU reset too quickly after the deletion?

I seem to remember having run into this as well, and then we added this to the Alarm: 8014[1] FAQ:

If the alarm is posted again after restarting the controller, make sure to allow sufficient time for the controller to properly delete the job (flushing changes to the file system may take some time). Allow for at least 20 seconds between deleting the job and restarting the controller.

Could be what you experienced.

When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a Humble-built version of these messages and never rebuilt them for Iron. So it's probably that.

Yeah, I rebuilt the messages for Iron and it worked.

I would've expected that, yes.

Key things have changed between Humble and Iron. In particular, https://github.com/ros2/rmw_dds_common/pull/68 breaks everything between those versions.

I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.

I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.

[..]

A quick Google search turned up someone who got a similar error caused by a version mismatch.

[..]

Definitely some differences. Don't know if they're important.

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

It's also slightly surprising to see the SHM transport mentioned in the error message. AFAIK, that's explicitly disabled in the Agent Docker image.
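
For completeness: if SHM does turn out to be active somewhere, it can be forced off with a Fast DDS XML profile. A minimal sketch (file name and location are arbitrary examples):

# restrict the default participant to UDPv4 only, disabling SHM
cat > /tmp/fastdds_no_shm.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>udp_only</transport_id>
      <type>UDPv4</type>
    </transport_descriptor>
  </transport_descriptors>
  <participant profile_name="no_shm" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_only</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>
    </rtps>
  </participant>
</profiles>
EOF
export FASTRTPS_DEFAULT_PROFILES_FILE=/tmp/fastdds_no_shm.xml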

I was surprised that control_msgs wasn't included by default in the Iron Docker image. I feel like it was included in Humble.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

Slightly off-topic, but:

Looking at the logging script, I think it might be due to the goal time tolerance rather than the position tolerance.

[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[..]

Is your debug listener out-of-date? That formatting (the date duplication) was fixed a long time ago.

ted-miller commented 3 months ago

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

That's fair.

I had to manually clone and build this.

Didn't apt install ros-iron-control-msgs work?

I didn't know it had an apt package.

Is your debug listener out-of-date?

Extremely. I keep forgetting to update it.

ted-miller commented 3 months ago

The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.

Do you have a trick for updating this? I'm just going through the repos list and manually checking for newer versions. I'm assuming the following repos are applicable:

gavanderhoorn commented 3 months ago

I have a tool that checks our .repos against upstream. But part of it is manual work, yes.

I'll take a look tomorrow.


Edit:

Inconsequential changes

I always either update everything, or nothing.

Things may look inconsequential, but specific releases have been tested against/with each other. To avoid having to vet things ourselves again at all levels, it's best to stick (or at least start) with whatever upstream has tagged as a release.
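
Not the actual tool, but a rough sketch of the same idea (assumes a vcstool-style .repos file, PyYAML, and Git >= 2.18 for --sort; the file name is an example):

# for each repository in the .repos file, print pinned version vs newest upstream tag
python3 -c "
import yaml
d = yaml.safe_load(open('ros2.repos'))
for name, r in sorted(d['repositories'].items()):
    print(name, r['url'], r.get('version', '-'))
" | while read -r name url pinned; do
  latest=$(git ls-remote --tags --sort=-v:refname "$url" \
    | grep -v '\^{}' | head -n1 | sed 's@.*refs/tags/@@')
  echo "$name: pinned=$pinned latest=$latest"
done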

ted-miller commented 3 months ago

I pulled the latest Micro-XRCE-DDS-Client and applied the custom changes for a quick test.

I was not able to reproduce the original issue after updating. But I can't say with any certainty that the problem is fixed, since I never reproduced the original issue multiple times before updating. (Hindsight being 20/20 and all...)

gavanderhoorn commented 3 months ago

Combined with the latest test build of Iron M+ libmicroros (shared with you in the other thread), https://github.com/Yaskawa-Global/motoros2/compare/main...gavanderhoorn:iron_no_remap_rules should build.


Edit: this is of course just a workaround for now.

ted-miller commented 2 weeks ago

@gavanderhoorn Is Iron still of any benefit? I've asked @jimmy-mcelwain to start looking at this, but I see that micro-ROS now supports Jazzy.

gavanderhoorn commented 2 weeks ago

Yes. We should support Humble, Iron and Jazzy.

Humble and Jazzy are LTS releases, so they will be around for quite some time.

Iron is not an LTS, but supporting it should prepare us for Jazzy.

gavanderhoorn commented 1 week ago

Updates to micro_ros_motoplus: Yaskawa-Global/micro_ros_motoplus#3.