ros2 / ci

ROS 2 CI Infrastructure
http://ci.ros2.org/
Apache License 2.0

Use venv from system instead of pinned virtualenv as venv provider. #644

Closed nuclearsandwich closed 2 years ago

nuclearsandwich commented 2 years ago

The pinned version of virtualenv 16 is broken with Python 3.10 and the pin has been around as long as it has due to virtualenv detection interfering with our install-scripts settings for libexec packages.

As part of other patches to enable Python 3.10, support for using a virtualenv with our CI has been moved completely off the table.

If successful, we'll have to compare this approach, which may only work with virtualenv/python/setuptools above a certain threshold, but it would then be on the table.

We've been down the venv road before with #385 but ultimately reverted to virtualenv in #399 due to "stability issues", notably https://github.com/ros2/build_farmer/issues/266.
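For readers unfamiliar with the distinction, the venv provider discussed here is the module that ships with the Python interpreter, as opposed to the separately installed virtualenv package. The following is a minimal sketch of creating an environment with it, not the code in this PR: the helper name create_system_venv and the choice of system_site_packages=True and with_pip=False are assumptions made for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_system_venv(venv_dir):
    """Create a virtual environment with the interpreter's own venv module.

    Hypothetical helper for illustration; not taken from the ros2/ci scripts.
    """
    builder = venv.EnvBuilder(
        system_site_packages=True,  # assumption: reuse system-installed packages
        with_pip=False,             # assumption: skip the ensurepip bootstrap
    )
    builder.create(venv_dir)
    return Path(venv_dir)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        env = create_system_venv(Path(tmp) / 'venv')
        # Every environment created by venv has a pyvenv.cfg marker at its root.
        print((env / 'pyvenv.cfg').is_file())
```

Because the module comes with the interpreter, there is no separate version to pin, which is the property this PR is relying on.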

nuclearsandwich commented 2 years ago

I opted to run CI for Foxy to verify that the venv changes don't affect stable distros on Focal. In the interest of not totally locking up CI, I have made the supposition that if Foxy works, so too will Galactic, but I can run further builds if folks don't think that's a sound assumption.

nuclearsandwich commented 2 years ago
  • Ubuntu-aarch64 Focal @ Foxy Build Status

Well this build shows that on Focal we still have issues:

Traceback (most recent call last):
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/launch_service.py", line 228, in _process_one_event
    await self.__process_event(next_event)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/launch_service.py", line 248, in __process_event
    visit_all_entities_and_collect_futures(entity, self.__context))
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/utilities/visit_all_entities_and_collect_futures_impl.py", line 45, in visit_all_entities_and_collect_futures
    futures_to_return += visit_all_entities_and_collect_futures(sub_entity, context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/utilities/visit_all_entities_and_collect_futures_impl.py", line 45, in visit_all_entities_and_collect_futures
    futures_to_return += visit_all_entities_and_collect_futures(sub_entity, context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/utilities/visit_all_entities_and_collect_futures_impl.py", line 38, in visit_all_entities_and_collect_futures
    sub_entities = entity.visit(context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/action.py", line 108, in visit
    return self.execute(context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch_ros/lib/python3.8/site-packages/launch_ros/actions/node.py", line 453, in execute
    ret = super().execute(context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/actions/execute_process.py", line 823, in execute
    self.__expand_substitutions(context)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/actions/execute_process.py", line 668, in __expand_substitutions
    cmd = [perform_substitutions(context, x) for x in self.__cmd]
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/actions/execute_process.py", line 668, in <listcomp>
    cmd = [perform_substitutions(context, x) for x in self.__cmd]
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/utilities/perform_substitutions_impl.py", line 26, in perform_substitutions
    return ''.join([context.perform_substitution(sub) for sub in subs])
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/utilities/perform_substitutions_impl.py", line 26, in <listcomp>
    return ''.join([context.perform_substitution(sub) for sub in subs])
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch/lib/python3.8/site-packages/launch/launch_context.py", line 197, in perform_substitution
    return substitution.perform(self)
  File "/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/launch_ros/lib/python3.8/site-packages/launch_ros/substitutions/executable_in_package.py", line 79, in perform
    raise SubstitutionFailure(
launch.substitutions.substitution_failure.SubstitutionFailure: package 'demo_nodes_py' found at '/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/demo_nodes_py', but libexec directory '/home/jenkins-agent/workspace/ci_linux-aarch64/ws/install/demo_nodes_py/lib/demo_nodes_py' does not exist

The packaging jobs for Jammy look good though, demo_nodes_py has the libexec scripts:

steven@rizzo:~[130] > tar tf ~/Downloads/ros2-package-linux-x86_64.tar.bz2 | grep '^ros2-linux/lib/demo_nodes_py'
ros2-linux/lib/demo_nodes_py/
ros2-linux/lib/demo_nodes_py/add_two_ints_client
ros2-linux/lib/demo_nodes_py/add_two_ints_client_async
ros2-linux/lib/demo_nodes_py/add_two_ints_server
ros2-linux/lib/demo_nodes_py/listener
ros2-linux/lib/demo_nodes_py/listener_qos
ros2-linux/lib/demo_nodes_py/listener_serialized
ros2-linux/lib/demo_nodes_py/talker
ros2-linux/lib/demo_nodes_py/talker_qos

If that's the case we'll have to decide whether it's worth using two different virtualenv commands or not using virtualenvs at all...
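The failing condition in the traceback is simply that the directory <install>/<package>/lib/<package> is missing. A rough sketch of checking for it locally follows; the function name libexec_scripts is hypothetical, and the layout is assumed from the paths shown in the error message above (one directory per package under the install prefix).

```python
import tempfile
from pathlib import Path


def libexec_scripts(install_prefix, package):
    """Return sorted script names in <prefix>/<package>/lib/<package>.

    Returns None when the libexec directory is missing, which is the
    condition reported by the SubstitutionFailure in the traceback.
    """
    libexec_dir = Path(install_prefix) / package / 'lib' / package
    if not libexec_dir.is_dir():
        return None
    return sorted(p.name for p in libexec_dir.iterdir())


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        # Fake install tree standing in for a real workspace.
        libexec = Path(tmp) / 'demo_nodes_py' / 'lib' / 'demo_nodes_py'
        libexec.mkdir(parents=True)
        (libexec / 'talker').touch()
        (libexec / 'listener').touch()
        print(libexec_scripts(tmp, 'demo_nodes_py'))
        print(libexec_scripts(tmp, 'missing_pkg'))
```

Running this against a workspace built each way would show whether the scripts ended up in the libexec directory or somewhere else.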

nuclearsandwich commented 2 years ago
  • Windows @ Focal Build Status

Also came back extremely negative.

nuclearsandwich commented 2 years ago
  • Ubuntu Focal @ Rolling Build Status

  • Ubuntu-aarch64 Focal @ Rolling Build Status

It's pretty weird to me that Rolling on Focal comes back clean... I'd expect it to have the same problems as Focal @ Foxy and RHEL @ Rolling, both of which exhibit test failures related to setuptools ignore_options within venvs.

nuclearsandwich commented 2 years ago
  • Windows @ Focal Build Status

Also came back extremely negative.

This doesn't actually make sense / have bearing on the venv usage since Windows builds do not use the --do-venv argument at all. I've started a rebuild of Foxy rclpy on Windows to confirm the tests fail there as well. Build Status

nuclearsandwich commented 2 years ago

Okay reviewing the jobs I've identified the following:

All of the builds in https://github.com/ros2/ci/pull/644#issuecomment-1074541795 were run with setuptools from PyPI, whereas I had intended to run using the setuptools version available in system packages (pairing with #643). In those builds there is a test failure discrepancy:

Foxy on Focal has test failures that don't express in the Rolling on Focal builds, which has me absolutely stymied. That, plus failing to realize that the failures on the RHEL build were present on master, led me to add a commit to this PR which only uses the venv utility for Python 3.10. But on closer inspection that may not be required, and if we can just use venv everywhere I find that preferable to any situation where we only use it conditionally.

I'm running test_launch_ros in foxy/focal again with setuptools and venv from apt: Build Status And will compare it to this main branch build of the same: Build Status

nuclearsandwich commented 2 years ago

I'm running test_launch_ros in foxy/focal again with setuptools and venv from apt: Build Status And will compare it to this main branch build of the same: Build Status

Well, using system setuptools did not resolve the errors with Foxy, and those errors aren't present using setuptools, pip, and virtualenv==16.7.9.

So why does this build of Rolling on Focal not exhibit these failures in the same situation? (If it does I'll be relieved!) Build Status

nuclearsandwich commented 2 years ago

Retesting Foxy on Focal with a conditional that continues to use the virtualenv utility when args.ros_distro == 'foxy' but I'm using virtualenv 20 from the apt repositories. If that doesn't work I'll add an additional docker layer to install virtualenv==16.7.9 from pip instead of python3-virtualenv.
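The conditional described here could look roughly like the sketch below. This is an illustration under assumptions, not the actual change: venv_command is a hypothetical helper, and only the args.ros_distro == 'foxy' comparison comes from this thread.

```python
def venv_command(ros_distro, python_executable, venv_dir):
    """Pick the command used to create the build's virtual environment.

    Hypothetical helper; the real CI scripts may pass additional options.
    """
    if ros_distro == 'foxy':
        # Foxy keeps using the third-party virtualenv utility.
        return [python_executable, '-m', 'virtualenv', venv_dir]
    # Every other distro uses the standard-library venv module.
    return [python_executable, '-m', 'venv', venv_dir]


if __name__ == '__main__':
    for distro in ('foxy', 'rolling'):
        print(distro, venv_command(distro, 'python3', 'venv'))
```

Keeping the decision in one function limits the special case to a single place, which matters given the stated preference for using venv everywhere once Foxy no longer needs the exception.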

Build Status

nuclearsandwich commented 2 years ago

Hopefully the final build, it was indeed necessary to use virtualenv==16 from pip.

Build Status

nuclearsandwich commented 2 years ago

I have made the supposition that if Foxy works, so too will Galactic, but I can run further builds if folks don't think that's a sound assumption.

Since Foxy didn't work and is now getting special handling I have nearly forgotten to test Galactic to see whether it aligns with Foxy or Rolling on this issue. Either way that should give us a nice bisect in figuring out what is going on.

Build Status