ros2 / rmw_connextdds

ROS 2 RMW layer for RTI Connext DDS Professional and RTI Connext DDS Micro.
Apache License 2.0
48 stars 33 forks source link

Replace ros2/rmw_connext with rticommunity/rmw_connextdds #9

Closed asorbini closed 3 years ago

asorbini commented 3 years ago

This issue tracks several PRs aimed at switching the RMW implementation for RTI Connext DDS included in the ROS2 source tree from the one provided by ros2/rmw_connext (rmw_connext_cpp) to the one provided by rticommunity/rmw_connextdds (rmw_connextdds) in preparation for the ROS2 Galactic release.

This ros2.repos can be used to build the source tree using forks of all affected repositories.

emersonknapp commented 3 years ago

Based on this ticket - are you also planning to remove rmw_connext from the default ros2.repos here https://github.com/ros2/ros2/blob/master/ros2.repos#L290 ?

wjwwood commented 3 years ago

@emersonknapp Once we get everything in place, I think that's the plan. We still have to review these pull requests of course.

emersonknapp commented 3 years ago

:+1: just trying to help make sure that list is complete - https://github.com/ros2/ci will need updates too

asorbini commented 3 years ago

@emersonknapp I plan on creating a PR for ros2/ci next. I've already gone through the exercise of finding all required changes, but I haven't yet got around to applying them in a fork.

Once I create that PR, I'll likely need some guidance to make sure things look good, since I'm not sure how we will approach these updates (especially when it comes to making sure nothing breaks catastrophically :) ).

In general you are right that all these changes will need to be coordinated with a timely update of the default ros2.repos (for which I will also create a PR soon), and all the necessary updates to ci.ros2.org.

I also modified ros2.repos to build the source tree with all the forked repos, which I've been using to test the changes using ci_launcher.

clalancette commented 3 years ago

By the way, I realized one other thing. Once we merge all of these PRs and get a successful run on https://ci.ros2.org, we are also going to have to do a bloom release and get it onto https://build.ros2.org right away. That's because all Rpr jobs are going to fail until we get a successful rmw_connextdds release done via bloom.

ivanpauno commented 3 years ago

I think that some of the PRs above are needed to make CI pass with rmw_connextdds, but others are only removing support for rmw_connext_cpp.

I would first only merge the ones in the first group, which is going to make things easier IMO.

PRs required to make rmw_connextdds pass CI:

Nice to merge:

Only removing support for rmw_connext_cpp/rmw_connext_dynamic_cpp:

emersonknapp commented 3 years ago

we are also going to have to do a bloom release and get it onto https://build.ros2.org right away

For sequencing - I would expect ^ to be the first step once it has been determined to work correctly. It seems to me that there is definitely an order of operations that results in nothing being broken, and without having to synchronize any of the events. From my perspective, that looks something like:

Much of the above consideration seems to be concerned with atomically replacing, rather than the above approach of adding the new thing, stabilizing, then removing the old thing.

asorbini commented 3 years ago

Thank you all for your input. I'd like to combine your suggestions, in that I think we could rank the PRs in order of priority, and do staggered merges, while also proceeding with a more incremental policy.

Indeed all these PRs try to replace rmw_connext_cpp with rmw_connextdds "atomically", and we might be making the transition more complicated than it needs to be.

From your suggestions, I'd like to propose the following plan:

Does this sound reasonable and along the lines of what you had in mind?

Please feel free to edit/add to the list.

ivanpauno commented 3 years ago

(@asorbini) New PR for ros2/ci to replace rmw_connext_cpp with rmw_connextdds on ci.ros2.org Or should this change also be incremental?

IMO, removing the old implementations from CI directly is fine.

Does this sound reasonable and along the lines of what you had in mind?

The plan sounds perfect to me, @clalancette can you double check if this is fine?

clalancette commented 3 years ago

IMO, removing the old implementations from CI directly is fine.

Yes, agreed. I think we should aim to completely replace rmw_connext_cpp, I just want to make sure we aren't causing problems for everyone else while we are doing it.

Two additional items:

  1. As we've discussed elsewhere, we need to change the cmake WARNING message in rmw_connextddsmicro to a STATUS message (so our CI will go green).
  2. In order for us to automatically branch from Rolling -> Galactic, the release repository needs to be under ros2-gbp. So I've gone ahead and created a release repository for you there (https://github.com/ros2-gbp/rmw_connextdds-release). If you need help with the bloom-release, please feel free to reach out to us; it can be a bit tricky the first time.

Other than that, the above plan looks great.

asorbini commented 3 years ago

I have made progress on several tasks (as reflected in the edited comment above).

@clalancette can you help me complete the "blooming" and releasing of rmw_connextdds? I also need help in figuring out how to make the proposed changes to ci.ros2.org (and if what I found is enough to disable rmw_connext_cpp and enable rmw_connextdds of course).

@ivanpauno can you help me progress the different PRs towards merging?

Thanks!

clalancette commented 3 years ago

@clalancette can you help me complete the "blooming" and releasing of rmw_connextdds?

I think you've figured this out, since you successfully opened the PR. I've now fixed the name of the release repository under ros2-gbp, so you should be able to re-release there. Ping me again once you've re-opened the PR, I'll take a look at the release repository.

I also need help in figuring out how to make the proposed changes to ci.ros2.org (and if what I found is enough to disable rmw_connext_cpp and enable rmw_connextdds of course).

Essentially you can change the code on https://github.com/ros2/ci , and then there is a field in https://ci.ros2.org/job/ci_launcher/build?delay=0sec where you can specify the CI branch to use. Unfortunately, that only works for branches pushed directly to https://github.com/ros2/ci. I've gone ahead and created a new branch https://github.com/ros2/ci/tree/asorbini/rmw_connextdds ; when you think you have the fixes you want to the CI code, open a PR against that branch and I'll merge it immediately. Then you can run the CI job and test it out. Once we've figured out all of the changes necessary, we can squash them all together and do a proper review.

asorbini commented 3 years ago

Ping me again once you've re-opened the PR, I'll take a look at the release repository.

I bloomed a new release (0.2.1), and opened a new PR with the ros2-gbp/rmw_connextdds-release repository (ros/rosdistro#28572).

Essentially you can change the code on https://github.com/ros2/ci , and then there is a field in https://ci.ros2.org/job/ci_launcher/build?delay=0sec where you can specify the CI branch to use. Unfortunately, that only works for branches pushed directly to https://github.com/ros2/ci. I've gone ahead and created a new branch https://github.com/ros2/ci/tree/asorbini/rmw_connextdds ; when you think you have the fixes you want to the CI code, open a PR against that branch and I'll merge it immediately. Then you can run the CI job and test it out. Once we've figured out all of the changes necessary, we can squash them all together and do a proper review.

Thank you for the setup work, I'll try my hand at running with these changes and report back!

clalancette commented 3 years ago

All right, we merged and the buildfarm tried to build it, but it failed:

https://build.ros2.org/view/Rbin_uF64/job/Rbin_uF64__rmw_connextdds_common__ubuntu_focal_amd64__binary/2/consoleFull

The interesting thing there is that it found Connext during initial CMake find_package time, but failed to link to it at the end. It sort of feels like a missing LD_LIBRARY_PATH or similar somewhere, but I'm not sure. It needs some investigation.

asorbini commented 3 years ago

This ouput from cmake is a bit suspicious to me:

13:55:41 -- Found rti_connext_dds_cmake_module: 0.2.1 (/opt/ros/rolling/share/rti_connext_dds_cmake_module/cmake)
13:55:41 -- Loaded RTI Connext DDS environment:
13:55:41 -- - RTI Connext DDS Professional:
13:55:41 --   - Found: false
13:55:41 --   - Version: 
13:55:41 --   - Directory: 
13:55:41 --   - Architecture: 

Usually rti_connext_dds_cmake_module generates an ...-extras.cmake file that stores the value for NDDSHOME it found during build.

I went to check the build job for that module and it seems like, while the plan succeeded, the build wasn't actually able to detect Connext:

13:52:57 -- Invalid CONNEXTDDS_DIR = ''
13:52:57 CMake Warning at cmake/rti_build_helper.cmake:380 (message):
13:52:57   no CONNEXTDDS_DIR nor NDDSHOME specified
13:52:57 Call Stack (most recent call first):
13:52:57   cmake/rti_build_helper.cmake:554 (rti_load_connextddsdir)
13:52:57   CMakeLists.txt:25 (rti_find_connextpro)

I think I know the reason for this: that module expects NDDSHOME/CONNEXTDDS_DIR to already be in the environment, and it "fails" (with a warning) if both are empty. In order to support the dependency from the Connext Debian package I will need to add some logic to try its installation path if the variables are empty (e.g. by searching for /opt/rti.com/rti_connext_dds-A.B.C).

Do you know how this is handled by the current connext_cmake_module?

EDIT: My analysis fails to explain why the build of rmw_connextdds_common succeeded anyway. I was thinking it was because it called the FindRTIConnextDDS.cmake script again, but in order to do that it would need a non-empty CONNEXTDDS_DIR to add to CMAKE_MODULE_PATH. I'll grab some lunch and think it through a bit more with a full stomach 😅

asorbini commented 3 years ago

I think the reason why the build for rmw_connextdds_common succeeded and was able to detect the Connext installation is because it implicitly loaded rosidl_typesupport_connext_c, likely as a transitive dependency (e.g. via rosidl_default_runtime).

13:55:40 -- Using all available rosidl_typesupport_c: rosidl_typesupport_connext_c;rosidl_typesupport_fastrtps_c;rosidl_typesupport_introspection_c

The linker step failed because rti_connext_dds_cmake_module didn't generate the "env hook" which would have configured the shell environment:


13:52:57 -- RTI Connext DDS Professional: NOT FOUND
13:52:57 -- Invalid CONNEXTDDS_DIR = ''
13:52:57 -- Invalid RTIMEHOME = ''
13:52:57 -- RTIMEHOME not found
13:52:57 -- RTI Connext DDS Micro: NOT FOUND
13:52:57 -- no environment hook generated
``
nuclearsandwich commented 3 years ago

I didn't realize that the new connextdds rmw also had a new _cmake_module provider. In order for that cmake_module provider to pick up the non-Debian standard RTI libraries the setup script needs to be sourced. In fact it needs to be patched to be sh-compatible and then sourced. Here's the relevant lines of how this is done in connext_cmake_module:

https://github.com/ros2-gbp/rosidl_typesupport_connext-release/blob/debian/rolling/connext_cmake_module/debian/rules.em#L67-L72 https://github.com/ros2-gbp/rosidl_typesupport_connext-release/blob/debian/rolling/connext_cmake_module/debian/rules.em#L24-L28

I'll open a PR on https://github.com/ros2-gbp/rmw_connextdds-release/blob/debian/rolling/rti_connext_dds_cmake_module/debian/rules.em implementing the same patch and once that is FF-merged in a re-blooming of the repository should pick up the new changes.

nuclearsandwich commented 3 years ago

@asorbini I've opened https://github.com/ros2-gbp/rmw_connextdds-release/pull/1 which I think is the missing piece. Once that's done the _cmake_module packages env hook will propagate the necessary info to other ROS packages so they don't need to directly source the rtisetenv script.

asorbini commented 3 years ago

I merged the PR, thank you @nuclearsandwich for catching this missing piece!

I will re-bloom the repository and keep my fingers crossed.

asorbini commented 3 years ago

Re-bloomed and opened a PR.

asorbini commented 3 years ago

Thank you @clalancette for merging the PR.

Unfortunately it seems like the build jobs failed again during the dh-shlibdeps step.

This time, rti_connext_dds_cmake_module correctly generated an env hook, but I think the issue is that I'm missing a <buildtool_export_depend>ament_cmake</buildtool_export_depend> in rti_connext_dds_cmake_module/package.xml which connext_cmake_module/package.xml has instead. Same for other packages from ros2/rmw_connext, which have the same dependency, and also depend on rti-connext-dds-5.3.1,

Do you think this might be the problem?

asorbini commented 3 years ago

I opened ros2/rcl#900 to fix a test regression, which I think is the only remaining "obstacle" towards all green CI plans (pending a couple of re-runs after the PR is merged to make sure everything is stable).

See ci_linux#13921 for a run with the default ros2.repos and rmw_connextdds added via overlay.

ivanpauno commented 3 years ago

This time, rti_connext_dds_cmake_module correctly generated an env hook, but I think the issue is that I'm missing a ament_cmake in rti_connext_dds_cmake_module/package.xml which connext_cmake_module/package.xml has instead. Same for other packages from ros2/rmw_connext, which have the same dependency, and also depend on rti-connext-dds-5.3.1,

I don't think that was the problem, but adding that line should be fine. From the logs, the issue seems to be:

18:25:36 dpkg-shlibdeps: error: cannot find library /opt/rti.com/rti_connext_dds-5.3.1/lib/x64Linux3gcc5.4.0/libnddsc.so needed by debian/ros-rolling-rmw-connextdds-common/opt/ros/rolling/lib/librmw_connextdds_common_pro.so (ELF format: 'elf64-x86-64' abi: '0201003e00000000'; RPATH: '')

it's weird that it wasn't able to find the library

clalancette commented 3 years ago

So, one problem is that it looks like the patch to the release repository didn't stick; in https://build.ros2.org/view/Rbin_uF64/job/Rbin_uF64__rmw_connextdds_common__ubuntu_focal_amd64__binary/5/consoleFull , I don't see it trying to source /tmp/rtisetenv.sh.

However, even when I hacked that into my local build, I couldn't get it to work. The following is enough to be able to reproduce it locally:

echo "deb http://packages.ros.org/ros2-testing/ubuntu focal main" | sudo tee /etc/apt/sources.list.d/ros2-latest.list
sudo apt-get update
sudo apt-get install ros-rolling-desktop
sudo apt-get install ros-rolling-rosidl-typesupport-connext-c ros-rolling-rosidl-typesupport-connext-cpp
wget http://repo.ros2.org/ubuntu/building/pool/main/r/ros-rolling-rti-connext-dds-cmake-module/ros-rolling-rti-connext-dds-cmake-module_0.2.1-3focal.20210312.022203_amd64.deb
sudo dpkg -i ros-rolling-rti-connext-dds-cmake-module_0.2.1-3focal.20210312.022203_amd64.deb
wget http://repo.ros2.org/ubuntu/building/pool/main/r/ros-rolling-rmw-connextdds-common/ros-rolling-rmw-connextdds-common_0.2.1.orig.tar.gz
wget http://repo.ros2.org/ubuntu/building/pool/main/r/ros-rolling-rmw-connextdds-common/ros-rolling-rmw-connextdds-common_0.2.1-3focal.debian.tar.xz
tar -xvf ros-rolling-rmw-connextdds-common_0.2.1.orig.tar.gz
cd ros-rolling-rmw-connextdds-common-0.2.1
tar -xvf ../ros-rolling-rmw-connextdds-common_0.2.1-3focal.debian.tar.xz
debuild -uc -us

I don't have a solution right now, but at least this gives us a way to debug it.

@asorbini I'll suggest trying this out locally and seeing what it takes to make it work. It may also be worthwhile to run a similar procedure on the older rmw_connext_cpp libraries to see what steps they take to avoid this problem.

ruffsl commented 3 years ago

Not sure if it's entirely applicable, but you can check out the docker file here as a reference example for environmental setup of connext 5.3.1 or 6.0.1:

https://github.com/ros-swg/turtlebot3_demo/blob/master/Dockerfile

asorbini commented 3 years ago

I don't have a solution right now, but at least this gives us a way to debug it.

@asorbini I'll suggest trying this out locally and seeing what it takes to make it work. It may also be worthwhile to run a similar procedure on the older rmw_connext_cpp libraries to see what steps they take to avoid this problem.

@clalancette thank you for the tip, I'll give local debugging a try and compare the release repository with the one for rmw_connext_cpp to see if there are any more differences.

Meanwhile, I generated 0.3.0-1 last night (#28598) to test if those missing deps in package.xml will make any difference.

@ruffsl I think in this case the issue is more related to the specifics of Debian packaging than to how Connext is loaded in the environment. Nonetheless, I'll keep the Dockerfile in mind as reference, thanks!

asorbini commented 3 years ago

@ivanpauno thank you for merging, unfortunately, no dice ☚ī¸ Looks like it's debug time

nuclearsandwich commented 3 years ago

So, one problem is that it looks like the patch to the release repository didn't stick; in build.ros2.org/view/Rbin_uF64/job/Rbin_uF64rmw_connextdds_commonubuntu_focal_amd64__binary/5/consoleFull , I don't see it trying to source /tmp/rtisetenv.sh.

That package is not supposed to source the RTI script. Everything downstream of rti_connextdds_cmake_module should get its environment set by the hooks created in that package.

@asorbini one advantage of the ros_buildfarm scripts is that they can be used to generate local environments as well as on the buildfarm. If you install the ros_buildfarm python module (I recommend using the master branch python3 -mpip install git+https://github.com/ros-infrastructure/ros_buildfarm to reproduce what's running on our farm) You can use the command:

generate_release_script.py https://raw.githubusercontent.com/ros2/ros_buildfarm_config/ros2/index.yaml rolling default rmw_connextdds_common ubuntu focal amd64 > run.sh

to create a script which will use Docker to create the exact same container context that the buildfarm does. Prying into that environment requires a bit of specialized knowledge and Docker experience but it is doable. A brief version is that you can modify the script not to automatically clean up the last container (remove the --rm flag from the final docker-run and then you can restart the container, stop the python3 control process (docker exec -u root $CNAME pkill -SIGSTOP python3) and then start mucking around in it to figure out what's going on.

Edit: Corrected command which is generate_release_script.py rather than generate_release_job.py.

asorbini commented 3 years ago

@nuclearsandwich I installed the ros_buildfarm from git, and tried running the command you provided.

After figuring out the authentication (by creating an OAuth token on GitHub, with public_repo permissions), I'm getting an "Access Denied" error (asorbini is missing the View/Configure permission).

Anything I may be doing wrong?

asorbini commented 3 years ago

@clalancette I've been running the CI plans with branch asorbini/rmw_connextdds of ros2/ci, and with the manifest from ros2/ros2#1099.

The results were green this morning, but now show a regression in test_tutorial_parameter_events__rmw_connextdds from demo_nodes_cpp (links in the task list above)

I tried running the test locally (and running parameter_events by hand too), and I noticed a failure with both rmw_connextdds and rmw_connext_cpp (which wasn't built by my CI runs).

As far as I can tell, the problem is that when running parameter_events with these two RMWs the output looks like this, while the test expects this (and that's what other RMWs produce).

Do you know of anything that might be causing the different output from the Connext RMWs? Since both of them show the same thing, I'm wondering if there's something that was added recently and it's missing in these implementations.

EDIT: The Windows plan also reported a failure in test_timer__rmw_connextdds from rcl, which I have seen failing at least one other time on Windows only in the past.

nuclearsandwich commented 3 years ago

After figuring out the authentication (by creating an OAuth token on GitHub, with public_repo permissions), I'm getting an "Access Denied" error (asorbini is missing the View/Configure permission).

Anything I may be doing wrong?

Hmm. You shouldn't need any authentication to clone repositories that are all public. GitHub was down when I composed that example command so perhaps I made a mistake.

nuclearsandwich commented 3 years ago

Ah, yes. I had a very critical mistake in my example. generatereleasejob.py tries to update the job on the buildfarm Jenkins instance, which does need Jenkins credentials that you do not and should not have.. The proper command is generatereleasescript.py. Sorry about that!

generate_release_script.py https://raw.githubusercontent.com/ros2/ros_buildfarm_config/ros2/index.yaml rolling default rmw_connextdds_common ubuntu focal amd64 > run.sh

All other instructions stand.

asorbini commented 3 years ago

After some investigation, I noticed that the build for rmw_connext_shared_cpp and other packages loading Connext via connext_cmake_module and its CMake find script, will link nddsc and its dependencies after -Wl,--no-as-needed (e.g. -Wl,--no-as-needed -lnddsc -lnddscore -lnddscpp -lrticonnextmsgcpp -lpthread -ldl -Wl,--as-needed).

This is not happening for rmw_connextdds_common, since rti_connext_dds_cmake_module uses the Connext 5.3.1 find script which doesn't add these linker flags.

Do you think it might have something to do with the failure from debhelper to find nddsc.so?

Another difference is that rmw_connextdds links nddsc with the full path instead of just as -lnddsc:

...
-Wl,-rpath,/opt/rti.com/rti_connext_dds-5.3.1/lib/x64Linux3gcc5.4.0:/opt/ros/rolling/lib:
     /opt/rti.com/rti_connext_dds-5.3.1/lib/x64Linux3gcc5.4.0/libnddsc.so
...

Finally, is there a way for me to make changes to the repository and test changes locally without first blooming a new release and have its PR merged?

clalancette commented 3 years ago

Do you think it might have something to do with the failure from debhelper to find nddsc.so?

Another difference is that rmw_connextdds links nddsc with the full path instead of just as -lnddsc:

Certainly one of those two things could be the problem. I don't really know since I wasn't involved in setting up rmw_connext_cpp originally.

Finally, is there a way for me to make changes to the repository and test changes locally without first blooming a new release and have its PR merged?

You can either do the "by-hand" method that I pointed out, or the docker method that @nuclearsandwich pointed out. Once you have things reasonably working there, I don't know of a further way to do testing except "live" on the buildfarm.

So my suggestion is that you make things work for you locally, then submit a PR and merge it. Then open the rosdistro PR. The cycle time to test things out is a little slow, but numbers are cheap so it doesn't matter if it takes a few iterations to get it right.

ivanpauno commented 3 years ago

After some investigation, I noticed that the build for rmw_connext_shared_cpp and other packages loading Connext via connext_cmake_module and its CMake find script, will link nddsc and its dependencies after -Wl,--no-as-needed (e.g. -Wl,--no-as-needed -lnddsc -lnddscore -lnddscpp -lrticonnextmsgcpp -lpthread -ldl -Wl,--as-needed).

I tracked down why that was the case, and adding that flag seems to be recommended in the rti Connext 5.3.1 platform notes:

For i86Linux3gcc4.8.2, x64Linux3gcc4.8.2, i86Linux3gcc5.4.0, i86Linux3.xgcc4.6.3,
x64Linux3gcc5.4.0, and x64Linux3.xgcc4.6.3, when running on Ubuntu CPU for dynamic
release and dynamic debug libraries, also use the following:
-Wl,--no-as-needed

We're using newer gcc versions, so IDK if that's still needed or not. The current error doesn't seem related to the lack of that flag anyway, that was causing undefined references to some symbols (see https://github.com/ros2/rcl/pull/55 for historical examples).

asorbini commented 3 years ago

You can either do the "by-hand" method that I pointed out, or the docker method that @nuclearsandwich pointed out. Once you have things reasonably working there, I don't know of a further way to do testing except "live" on the buildfarm.

So my suggestion is that you make things work for you locally, then submit a PR and merge it. Then open the rosdistro PR. The cycle time to test things out is a little slow, but numbers are cheap so it doesn't matter if it takes a few iterations to get it right.

Yep, I was able to reproduce locally with @nuclearsandwich's instructions. I was thinking more along the lines of "is there a way for me to test the release process with a custom branch/tag of the repository so that I may make changes to the source repo and not have to regenerate the *.orig.tar.gz by hand?" (because I "suspect" the Docker container creates everything from the info in rolling/distributions.yaml and the release repo).

I think I'll do some tests with the "manual approach" that you suggested, and recreate an upstream tarball with modifications as needed.

asorbini commented 3 years ago

I think I found the issue!

When building with Connext 5.3.1, the CMake helper function rti_find_connextpro() defines an imported target RTIConnextDDS::c_api to mimic the behavior of the FindRTIConnextDDS.cmake script in 6.0.1. Unfortunately, it looks like the imported library is currently missing property IMPORTED_SO_NAME, which is causing CMake to use the full path of nddsc when linking rmw_connextdds_common.so, instead of -lnddsc.

Setting IMPORTED_SO_NAME true seems to fix the issue, and allows dh-shlibdeps to successfully find nddsc.

I will push the change and re-bloom the repository to validate the fix (also re-run some CI tests to make sure the property works across platforms).

EDIT: new PR for 0.3.1-1

clalancette commented 3 years ago

It looks like it built! Woohoo!

So here is a follow-up list:

  1. It looks like the package https://build.ros2.org/job/Rbin_uF64__rmw_connextddsmicro__ubuntu_focal_amd64__binary/ is an empty package. That makes sense, since we don't currently have Connext Micro installed anywhere, and so the build just doesn't do anything. Given that that is the case, I'll highly suggest that we remove that package, so we don't give users the wrong impression on what is supported. The easiest way to do this is to add a file to the release repository that contains a list of the packages that you want ignored. See https://github.com/ros2-gbp/joystick_drivers-release/blob/master/rolling.ignored for an example. I'll suggest you create a "rolling.ignored" file containing "rmw_connextddsmicro", and then re-bloom.
  2. At this point you should be able to point your apt installation at the testing repository and install the packages. Something like echo "deb http://packages.ros.org/ros2-testing/ubuntu focal main" > /etc/apt/sources.list.d/ros2-latest.list, followed by sudo apt-get install ros-rolling-rmw-connextdds should do it. I'll suggest you do that and do some basic testing to make sure things work for you.
  3. Open up a PR based on https://github.com/ros2/ci/tree/asorbini/rmw_connextdds . We'll need to review and merge that.
  4. Merge https://github.com/ros2/rmw_implementation/pull/182 .
  5. Merge https://github.com/ros2/ros2/pull/1099 .
  6. Wait one overnight to make sure CI is happy.
  7. Assuming CI is happy, then we can start looking at removing the old RMW from ros2.repos and elsewhere.

We're getting close here!

asorbini commented 3 years ago
  1. It looks like the package https://build.ros2.org/job/Rbin_uF64__rmw_connextddsmicro__ubuntu_focal_amd64__binary/ is an empty package. That makes sense, since we don't currently have Connext Micro installed anywhere, and so the build just doesn't do anything. Given that that is the case, I'll highly suggest that we remove that package, so we don't give users the wrong impression on what is supported. The easiest way to do this is to add a file to the release repository that contains a list of the packages that you want ignored. See https://github.com/ros2-gbp/joystick_drivers-release/blob/master/rolling.ignored for an example. I'll suggest you create a "rolling.ignored" file containing "rmw_connextddsmicro", and then re-bloom.

Good idea! I had noticed the <release>.ignored in other release repositories, and I was already considering adding one. I added rmw_connextddsmicro to the ignored list and re-bloomed the repository.

  1. At this point you should be able to point your apt installation at the testing repository and install the packages. Something like echo "deb http://packages.ros.org/ros2-testing/ubuntu focal main" > /etc/apt/sources.list.d/ros2-latest.list, followed by sudo apt-get install ros-rolling-rmw-connextdds should do it. I'll suggest you do that and do some basic testing to make sure things work for you.

I did a local test, and the packages installed and ran a hello world talker/listener with no issue (including some basic command line tools, like ros2 node, ros2 param).

  1. Open up a PR based on https://github.com/ros2/ci/tree/asorbini/rmw_connextdds . We'll need to review and merge that.

I opened a PR on ros2/ci.

  1. Merge ros2/rmw_implementation#182 .

  2. Merge ros2/ros2#1099 .

  3. Wait one overnight to make sure CI is happy.

  4. Assuming CI is happy, then we can start looking at removing the old RMW from ros2.repos and elsewhere.

We're getting close here!

Definitely starting to see "the light at the end of the ~tunnel~ merge" 🎉

One final note on the CI tests: I re-ran the plans (with my CI branch), and it seems like the failure in test_tutorial_parameter_events__rmw_connextdds is consistent (see my previous comment in this thread).

Here are the results from my latest runs:

ivanpauno commented 3 years ago

test_tutorial_parameter_events__rmw_connextdds

That one is new and started happening also with rmw_connext_cpp, I have opened a PR to fix it: https://github.com/ros2/demos/pull/491.

clalancette commented 3 years ago

All right, I've merged and deployed the changes to CI. Here's a job with no current changes, just to make sure things are working as I expect:

Here's a set of jobs with https://github.com/ros2/rmw_implementation/pull/182 and https://github.com/ros2/ros2/pull/1099 in place:

clalancette commented 3 years ago

OK, we have a lot of this in now. Here's what I think still needs to be done:

Before freeze (April 5):

Ideally before freeze, but can be done after:

clalancette commented 3 years ago

Here's a CI run with all of the open PRs to see where we are:

ivanpauno commented 3 years ago

My 2 cents about merge ordering:

clalancette commented 3 years ago

All of the failures in uncrustify were fixed by my latest commits in rclcpp, so that should go away. The only other failure was on Windows, but I think @asorbini knows about that one and is looking into it.

I'm going to go back through the open PRs and respond to comments now.

asorbini commented 3 years ago

The failure is slightly different from what I've seen before, but it still seems related to "clock jitter" on Windows (so to speak). I'll investigate the specific test case further to confirm it, but I would assess the failure as "minor" for now.

asorbini commented 3 years ago

@clalancette thank you for transferring the repository to the ros2 org 🎉

Here's some "follow up" items:

Q: should I run bloom again to generate a new release? (0.3.1-3, I still have a few things to verify before going to 0.4.0)

clalancette commented 3 years ago

One more CI with all of the things that have been merged, plus the changes in https://github.com/ros2/rcl/pull/903, https://github.com/ros2/rclpy/pull/698, and https://github.com/ros2/rclcpp/pull/1595:

clalancette commented 3 years ago

Q: should I run bloom again to generate a new release? (0.3.1-3, I still have a few things to verify before going to 0.4.0)

No, it shouldn't be necessary. It will all just be in place for when the next release gets done.

clalancette commented 3 years ago

All right, all things on the various lists here have been completed. In my opinion, this issue can be closed. @asorbini I'll let you do the honors.