gavanderhoorn opened 1 year ago
If I'm not mistaken, Iron support has just been released upstream: micro-ROS Iron: 4.1.0.
@ted-miller: I believe you suggested adding this to the 0.2.0 milestone in #120?
Edit: oh wait, I see I already added it. Nm.
Something to be aware of: https://github.com/ros2/rmw_dds_common/pull/68.
It's almost starting to pay off to implement some sort of CI that checks message compatibility between different ROS 2 distributions.
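Such a check could be fairly small. A sketch of the idea (the message definitions below are invented stand-ins; a real check would feed it the output of `ros2 interface show` from each distro's container):

```python
import hashlib

def msg_fingerprint(definition: str) -> str:
    """Hash a message definition, ignoring comments and whitespace."""
    lines = []
    for raw in definition.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            lines.append(" ".join(line.split()))
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()

# Invented example definitions; real ones would come from
# `ros2 interface show <type>` run in each distro's container.
humble_def = "string node_name\nstring node_namespace\n"
iron_def = "string node_name\nstring node_namespace\nstring enclave\n"

compatible = msg_fingerprint(humble_def) == msg_fingerprint(iron_def)
print("compatible:", compatible)
```

A CI job would simply fail when the fingerprints of a message diverge between the distributions under test.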
Doesn't build yet. But I'm working on this: https://github.com/Yaskawa-Global/motoros2/tree/iron_wip

- update VS project with Iron targets for all supported controllers

This kinda sucks to do, and the current method does not scale well. I'm assuming the cmake-passthrough mode of M+SDK would make compilation easier, but I'm not sure how IntelliSense would behave without the stuff that was manually added.
Very quick/preliminary testing. But I wanted to write it down before I leave.
AFAICT, libmicroros seems OK. (More testing needed)
Other things I noticed using the Iron version:

- I had to set S2C1402=3 and reboot twice. This was probably a mistake on my part, but I really thought I set it right on the first attempt.
- I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here.
- In joint_states, the cross-axis-coupling compensation routine isn't perfect. I'm seeing slight fluctuations in the T axis when jogging others. But this is likely due to rounding errors on our part and could probably be fixed with https://github.com/Yaskawa-Global/motoros2/issues/199.
- When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.

> I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here.
I did it yet again, and the alarm didn't occur. So, it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete the job. Perhaps I just did a cpu-reset too quickly after the deletion???
> When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.

Yeah, I rebuilt the messages for iron and it worked.
I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.
I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.
```
$ ros2 service call /read_group_io motoros2_interfaces/srv/ReadGroupIO "{address: 7001}"
2024-03-26 13:14:07.410 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
requester: making request: motoros2_interfaces.srv.ReadGroupIO_Request(address=7001)
response:
motoros2_interfaces.srv.ReadGroupIO_Response(result_code=0, message='Success', success=True, value=192)
```
But if I go back to the original terminal, I can make those service calls without any issue.
It seems that fastrtps might not be threadsafe. More testing is needed to submit a ticket to eProsima.
A quick google search said that someone got a similar error with a version mismatch.
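If the SHM transport itself is suspected, one way to take it out of the equation is a Fast DDS XML profile that restricts the participant to UDPv4. A sketch (the file path and the `udp_only` profile names below are illustrative):

```shell
# Write a Fast DDS profile that disables the builtin transports (incl. SHM)
# and uses UDPv4 only. Path and profile names are illustrative.
cat > /tmp/fastdds_no_shm.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>udp_only</transport_id>
      <type>UDPv4</type>
    </transport_descriptor>
  </transport_descriptors>
  <participant profile_name="udp_only_participant" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_only</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>
    </rtps>
  </participant>
</profiles>
EOF
# Fast DDS picks this up via the well-known environment variable:
export FASTRTPS_DEFAULT_PROFILES_FILE=/tmp/fastdds_no_shm.xml
```

With `FASTRTPS_DEFAULT_PROFILES_FILE` pointing at such a profile before starting the node, Fast DDS should not touch the shared-memory port files at all, which would help isolate whether SHM is actually involved in the error.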
My docker container:
```
$ ros2 doctor --report | grep fastrtps
2024-03-26 13:17:01.132 [RTPS_TRANSPORT_SHM Error] Failed init_port fastrtps_port7411: open_and_lock_file failed -> Function open_port_internal
rosidl_typesupport_fastrtps_cpp     : latest=3.0.2, local=3.0.2
rmw_fastrtps_shared_cpp             : latest=7.1.3, local=7.1.3
rosidl_dynamic_typesupport_fastrtps : latest=0.0.2, local=0.0.2
rosidl_typesupport_fastrtps_c       : latest=3.0.2, local=3.0.2
rmw_fastrtps_cpp                    : latest=7.1.3, local=7.1.3
fastrtps_cmake_module               : latest=3.0.2, local=3.0.2
middleware name                     : rmw_fastrtps_cpp
```
I'm not entirely sure which of our packages we would want to compare that to. I guess it would be the Agent, since it's doing the translation between fastrtps and microxrcedds.
The Agent container:
```
$ ros2 doctor --report | grep fastrtps
/opt/ros/humble/lib/python3.10/site-packages/ros2doctor/api/__init__.py: 154: UserWarning: Fail to call QoSCompatibilityReport class functions.
rosidl_typesupport_fastrtps_cpp : latest=2.2.2, local=2.2.2
rmw_fastrtps_shared_cpp         : latest=6.2.6, local=6.2.5
rosidl_typesupport_fastrtps_c   : latest=2.2.2, local=2.2.2
rmw_fastrtps_cpp                : latest=6.2.6, local=6.2.5
fastrtps_cmake_module           : latest=2.2.2, local=2.2.2
middleware name                 : rmw_fastrtps_cpp
```
Definitely some differences. Don't know if they're important.
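The two listings can be compared mechanically rather than by eyeballing. A sketch that parses the `pkg : latest=X, local=Y` lines and reports differing local versions (sample values copied from the two outputs above):

```python
import re

# Matches lines of the form "pkg : latest=X, local=Y" from `ros2 doctor --report`
LINE_RE = re.compile(r"^\s*(\S+)\s*:\s*latest=([\w.]+),\s*local=([\w.]+)")

def parse_doctor(text):
    """Parse doctor report lines into {package: local_version}."""
    out = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            out[m.group(1)] = m.group(3)
    return out

client = parse_doctor("""\
rmw_fastrtps_shared_cpp : latest=7.1.3, local=7.1.3
rosidl_typesupport_fastrtps_c : latest=3.0.2, local=3.0.2
""")
agent = parse_doctor("""\
rmw_fastrtps_shared_cpp : latest=6.2.6, local=6.2.5
rosidl_typesupport_fastrtps_c : latest=2.2.2, local=2.2.2
""")

mismatches = {p: (client[p], agent[p])
              for p in client.keys() & agent.keys()
              if client[p] != agent[p]}
print(mismatches)
```

Note that a Humble Agent and an Iron client are expected to ship different Fast DDS stacks, so a mismatch here is not by itself a smoking gun.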
I was surprised that control_msgs wasn't included by default in the Iron docker image. I feel like it was included in Humble. I had to manually clone and build this.
After running the FJT script, I got this message in my client.
```
control_msgs.action.FollowJointTrajectory_Result(error_code=-500301, error_string='Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]')
```
This is very confusing.
Looking at the logging script, I think it might be due to the time tolerance rather than position.
```
[1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete
[1711461114.98533487] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 FJT using DEFAULT goal time tolerance: 500000000 ns
[1711461114.98539233] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 FJT action failed
[1711461114.98543978] [192.168.1.31:50724]: 2024-03-26 13:51:54.984111 Final position was outside tolerance. Check robot safety-limits that could be inhibiting motion. [group_1/joint_1: 0.0000 deviation] [group_1/joint_5: 0.0000 deviation]
```
I might need to open a new issue for this.
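To illustrate the suspicion: a purely hypothetical sketch (not MotoROS2's actual implementation) of a goal-completion check in which the time tolerance, not position, causes the failure even though every joint deviation is 0.0:

```python
# Hypothetical sketch of an FJT goal-completion check; NOT MotoROS2's actual
# code. It shows how a goal can fail on the goal-time tolerance even when
# every joint deviation is 0.0, matching the confusing log output above.
DEFAULT_GOAL_TIME_TOLERANCE_NS = 500_000_000  # default seen in the debug log

def fjt_goal_ok(deviations, pos_tolerance, finish_time_ns, goal_end_time_ns,
                time_tolerance_ns=DEFAULT_GOAL_TIME_TOLERANCE_NS):
    within_position = all(abs(d) <= pos_tolerance for d in deviations)
    within_time = finish_time_ns <= goal_end_time_ns + time_tolerance_ns
    return within_position and within_time

# Zero deviation on every joint, but the trajectory finished ~1 s late:
late = fjt_goal_ok([0.0, 0.0], 0.01,
                   finish_time_ns=3_000_000_000, goal_end_time_ns=1_500_000_000)
print(late)
```

If something like this is what happens, the error string reporting "Final position was outside tolerance" with 0.0 deviations would just be a misleading message for a time-tolerance failure.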
Except for the misc notes above, it seems to be working. I was able to run the topics, services, and both types of motion.
> I got the invalid job alarm. Then I deleted the job, rebooted, and let it regen a new job. I then got the alarm again. Something fishy here.
>
> I did it yet again, and the alarm didn't occur. So, it seems that I just failed to delete the job. But I'm confused, because I was 90% sure that I did delete the job. Perhaps I just did a cpu-reset too quickly after the deletion???
I seem to remember having run into this as well, and then we added this to the "Alarm: 8014[1]" FAQ:

> If the alarm is posted again after restarting the controller, make sure to allow sufficient time for the controller to properly delete the job (flushing changes to the file system may take some time). Allow for at least 20 seconds between deleting the job and restarting the controller.
Could be what you experienced.
> When attempting to use any of the motoros2_interfaces messages, I get Segmentation fault (core dumped). As I'm typing this, I realize that I sourced a humble version of these messages and never rebuilt them for iron. So it's probably that.
>
> Yeah, I rebuilt the messages for iron and it worked.
I would've expected that yes.
Key things have changed between Humble and Iron. Especially https://github.com/ros2/rmw_dds_common/pull/68 breaks everything between those versions.
> I had one terminal open that was echoing robot_status and forgot to stop the echo. It was just running in the background.
> I opened another terminal and started making service calls. The majority of them "worked", but returned an error message.
>
> [..]
>
> A quick google search said that someone got a similar error with a version mismatch.
>
> [..]
>
> Definitely some differences. Don't know if they're important.
The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.
It's also slightly surprising to see the SHM transport mentioned in the error message. AFAIK, that's explicitly disabled in the Agent Docker image.
> I was surprised that control_msgs wasn't included by default in the Iron docker image. I feel like it was included in Humble. I had to manually clone and build this.
Didn't `apt install ros-iron-control-msgs` work?
Slightly off-topic, but:
> Looking at the logging script, I think it might be due to the time tolerance rather than position.
>
> [1711461114.98510480] [192.168.1.31:50724]: 2024-03-26 13:51:54.983911 Trajectory complete [..]
is your debug listener out-of-date? That formatting (the date duplication) was fixed a long time ago.
> The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.
That's fair.
> I had to manually clone and build this.

> Didn't `apt install ros-iron-control-msgs` work?
I didn't know that had an apt package.
> is your debug listener out-of-date?
Extremely. I keep forgetting to update it.
> The set of packages you've used to build M+ libmicroros is old. It's from around July last year. I would suspect that to be the cause here before anything else.
Do you have a trick for updating this? I'm just going through the repos list and manually checking for newer versions. I'm assuming the following repos are applicable:

- eProsima/Micro-CDR - No changes
- eProsima/Micro-XRCE-DDS-Client (with private modifications) - Update needed
- micro-ROS/rmw-microxrcedds.git - Inconsequential changes
- micro-ROS/micro-ROS-Agent - There is a newer version of this involving "thread", but it really seems inconsequential.
I have a tool which checks our .repos against upstream. But part of it is manual work, yes.
I'll take a look tomorrow.
Edit:

> Inconsequential changes
I always either update everything, or nothing.
Things may look inconsequential, but specific releases have been tested against/with each other. To avoid having to vet things ourselves again at all levels, it's best to stick (or at least start) with whatever upstream has tagged as a release.
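For reference, the comparison step of such a tool could look roughly like this (repo names taken from the list above; the pinned and upstream version strings are invented examples):

```python
# Sketch of the comparison step of a ".repos freshness" check. Repo names are
# from the discussion above; version strings are invented examples. A real
# tool would read the pins from the .repos file and fetch upstream tags
# (e.g. via `git ls-remote --tags <url>`).
pinned = {
    "eProsima/Micro-CDR": "v2.0.1",
    "eProsima/Micro-XRCE-DDS-Client": "v2.4.1",
    "micro-ROS/micro-ROS-Agent": "v2.4.1",
}
upstream = {
    "eProsima/Micro-CDR": "v2.0.1",
    "eProsima/Micro-XRCE-DDS-Client": "v2.4.2",
    "micro-ROS/micro-ROS-Agent": "v2.4.2",
}

outdated = {repo: {"pinned": pin, "upstream": upstream[repo]}
            for repo, pin in pinned.items()
            if repo in upstream and upstream[repo] != pin}

for repo, versions in sorted(outdated.items()):
    print(f"{repo}: {versions['pinned']} -> {versions['upstream']}")
```

Consistent with the "update everything or nothing" point: the tool should flag every pin that differs from the latest upstream release, not just the ones that look important.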
I pulled the latest Micro-XRCE-DDS-Client and applied the custom changes for a quick test.
I was not able to reproduce the original issue after updating. But I can't say with any certainty that the problem was fixed since I didn't reproduce the original issue multiple times. (hindsight being 20/20 and all...)
Combined with the latest test build of Iron M+ libmicroros (shared with you in the other thread), https://github.com/Yaskawa-Global/motoros2/compare/main...gavanderhoorn:iron_no_remap_rules should build.
Edit: this is of course just a work-around for now.
@gavanderhoorn Is Iron still of any benefit? I've asked @jimmy-mcelwain to start looking at this. But I see that microros now supports Jazzy.
Yes. We should support Humble, Iron and Jazzy.
Humble and Jazzy are LTS, so will be around for quite some time.
Iron is not an LTS, but should prepare us for Jazzy.
Updates to micro_ros_motoplus: Yaskawa-Global/micro_ros_motoplus#3.

Tasks:

- libmicroros build infra with Iron support
- micro_ros_motoplus: Yaskawa-Global/micro_ros_motoplus#3
- _buildscripts
- rcl changes