RobotLocomotion / ros-drake-vendor

Maintainer scripts that package Drake in the ROS build farm
Other
1 stars 1 forks source link

Problems with hardcoded clang12 path on Ubuntu Jammy build #2

Closed j-rivero closed 5 months ago

j-rivero commented 6 months ago

While compiling Drake 1.24 I found a compilation error related to a hardcoded path of clang12:

  [177 / 290] [Prepa] Linking common/proto/libcall_python.a                                                                                                                                                                                                                                                                                                                                   
  ERROR: /home/buildfarm/ws/build/drake/drake-prefix/src/drake/bindings/pydrake/BUILD.bazel:55:37: Action bindings/pydrake/documentation_pybind.h failed: (Exit 1): mkdoc failed: error executing command (from target //bindings/pydrake:generate_pybind_documentation_header) bazel-out/k8-opt-exec-2B5CBBC6/bin/tools/workspace/pybind11/mkdoc -DDRAKE_COMMON_SYMBOLIC_EXPRESSION_DETAIL_HE
  ADER -DFMT_LOCALE -DFMT_SHARED -DSPDLOG_SHARED_LIB -DSPDLOG_COMPILED_LIB ... (remaining 865 arguments skipped)                                                                                                                                                                                                                                                                              

  Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging                                                                                                                                                                                                                                                                                
  Traceback (most recent call last):                                                                                                                                                                                                                                                                                                                                                          
    File "/home/buildfarm/ws/build/drake/drake-prefix/src/drake-build/_bazel_/41972d64e5fdb39d20266560ab1ca985/sandbox/processwrapper-sandbox/4065/execroot/drake/bazel-out/k8-opt-exec-2B5CBBC6/bin/tools/workspace/pybind11/mkdoc.runfiles/drake/tools/workspace/pybind11/mkdoc.py", line 816, in <module>                                                                                  
      main()                                                                                                                                                                                                                                                                                                                                                                                  
    File "/home/buildfarm/ws/build/drake/drake-prefix/src/drake-build/_bazel_/41972d64e5fdb39d20266560ab1ca985/sandbox/processwrapper-sandbox/4065/execroot/drake/bazel-out/k8-opt-exec-2B5CBBC6/bin/tools/workspace/pybind11/mkdoc.runfiles/drake/tools/workspace/pybind11/mkdoc.py", line 644, in main                                                                                      
      add_library_paths(parameters)                                                                                                                                                                                                                                                                                                                                                           
    File "/home/buildfarm/ws/build/drake/drake-prefix/src/drake-build/_bazel_/41972d64e5fdb39d20266560ab1ca985/sandbox/processwrapper-sandbox/4065/execroot/drake/bazel-out/k8-opt-exec-2B5CBBC6/bin/tools/workspace/pybind11/mkdoc.runfiles/drake/tools/workspace/pybind11/libclang_setup.py", line 43, in add_library_paths                                                                 
      raise RuntimeError(f'Library file {library_file} does NOT exist')                                                                                                                                                                                                                                                                                                                       
  RuntimeError: Library file /usr/lib/x86_64-linux-gnu/libclang-12.so does NOT exist

Probably coming from libclang_setup.py.

The default version on Jammy is clang-14 and the one available through rosdistro. I remember to have read somewhere in the Drake documentation that this could come from the mix of clang12 and clang14 packages but I don't see any in the system:

 buildfarm@b8a9933c0c78:~/ws$ sudo dpkg -l | grep clang                                                                                                                                                                                                                                                                                                                                      
  ii  clang                                              1:14.0-55~exp2                          amd64        C, C++ and Objective-C compiler (LLVM based), clang binary                                                                                                                                                                                                                      
  ii  clang-14                                           1:14.0.0-1ubuntu1.1                     amd64        C, C++ and Objective-C compiler                                                                                                                                                                                                                                                 
  ii  libclang-14-dev                                    1:14.0.0-1ubuntu1.1                     amd64        Clang library - Development package                                                                                                                                                                                                                                             
  ii  libclang-common-14-dev                             1:14.0.0-1ubuntu1.1                     amd64        Clang library - Common development package                                                                                                                                                                                                                                      
  ii  libclang-cpp14                                     1:14.0.0-1ubuntu1.1                     amd64        C++ interface to the Clang library                                                                                                                                                                                                                                              
  ii  libclang-dev                                       1:14.0-55~exp2                          amd64        clang library - Development package                                                                                                                                                                                                                                             
  ii  libclang1-14                                       1:14.0.0-1ubuntu1.1                     amd64        C interface to the Clang library                                                                                                                                                                                                                                                
  buildfarm@b8a9933c0c78:~/ws$                                                                                                                 

Problem should be easy to reproduce using the following Dockerfile:

FROM ros:rolling-ros-core
LABEL maintainer="Jose Luis Rivero <jrivero@openrobotics.org>"
ENV LANG C
ENV LC_ALL C
ARG DEBIAN_FRONTEND=noninteractive

# TODO: add missing dependencies
RUN apt-get update && apt-get install -y \
  build-essential \
  python3-rosdep \
  python3-colcon-common-extensions \
  vim \
  # use to clone repository below 
  git \
  # rosdistro miss
  libjchart2d-java

RUN mkdir -p /tmp/ws/src
WORKDIR /tmp/ws/src
RUN git clone https://github.com/RobotLocomotion/ros-drake-vendor.git \
    && cd ros-drake-vendor \
    && git checkout c7a5cbb34c0da0fe7158366660d774222142f0ae

ARG user=buildfarm
ARG group=buildfarm
ARG uid=1000
ARG gid=1000
RUN echo "${user} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
RUN groupadd -g ${gid} ${group}
RUN useradd -u ${uid} -g ${group} -s /bin/sh -m ${user}
RUN chown buildfarm:buildfarm -R /tmp/ws/

RUN date
RUN cd /tmp/ws/src \
  && rosdep init \
  && rosdep update \
  && rosdep install --from-paths . --ignore-src -r -y

USER ${uid}:${gid}
WORKDIR /home/buildfarm
RUN echo 'build --jobs=5' >> user.bazelrc
RUN cp -a /tmp/ws /home/buildfarm/
WORKDIR /home/buildfarm/ws
ENTRYPOINT . /opt/ros/rolling/setup.sh \
  && colcon graph \
  && colcon build --executor sequential --event-handlers console_direct+ --cmake-args -DCMAKE_EXPORT_COMPILE_COMMANDS=1 || /bin/bash
jwnimmer-tri commented 6 months ago

The build-deps of Drake require libclang-12-dev to be installed in order to build:

https://github.com/RobotLocomotion/drake/blob/119c7f4c18afb0444a64fa188d24bbc64cf78924/setup/ubuntu/source_distribution/packages-jammy.txt#L4

I suppose your build recipe isn't yet accounting for mandatory setup, either by running install_prereqs.sh or accounting for the text-file-lists-of-packages that it uses?

j-rivero commented 6 months ago

I suppose your build recipe isn't yet accounting for mandatory setup, either by running install_prereqs.sh or accounting for the text-file-lists-of-packages that it uses?

Thanks for pointers and the info. I'm translating the dependencies into rosdistro keys for package.xml and anticipated that adding a clang12 specific key to the index might be problematic given that clang already exists and the rosdistro index (resolving to the default clang version in Jammy) and the ROS maintainers probably expect the packages to be using the default version of the compiler.

We also have the problem that when ROS Rolling changes the platform to the new Ubuntu Noble we will not have clang-12 anymore:

❯ rmadison clang-12
 clang-12 | 1:12.0.0-3ubuntu1~20.04.4 | focal-updates/universe  | riscv64
 clang-12 | 1:12.0.0-3ubuntu1~20.04.5 | focal-security/universe | amd64, arm64, armhf, i386, ppc64el, riscv64, s390x
 clang-12 | 1:12.0.0-3ubuntu1~20.04.5 | focal-updates/universe  | amd64, arm64, armhf, i386, ppc64el, s390x
 clang-12 | 1:12.0.1-19ubuntu3        | jammy/universe          | amd64, arm64, armhf, i386, ppc64el, riscv64, s390x

Given your answer I assume that there is no an easy way of making Drake to build/work with clang-14, is this correct or can we patch the code somehow to make it work?

jwnimmer-tri commented 6 months ago

I think what I'd failed to realize is that when building in the ROS buildfarm we can't simply use Ubuntu dependencies by their Ubuntu package name? We somehow need to transmogrify all of our dependency names into some kind of package.xml ROS thing?

That seems a lot more complicated than I'd realized. Is there really no "escape hatch" where we can just provide Build-Depends: a, b, c and be done with it? Is it mandatory to re-spell all of our dependencies using completely novel manifest infrastructure? I realize that if we were supporting non-Ubuntu ROS distros then a re-spelling would be required, but we're only targeting Ubuntu here.

Given your answer I assume that there is no an easy way of making Drake to build/work with clang-14, is this correct or can we patch the code somehow to make it work?

On macOS we already use XCode 15's libclang.so and (aside from one RobotLocomotion/drake#20561, already fixed) that newer version of Clang seems to work OK.

So, I would not be surprised if adding this patch file onto the ROS build was all we needed to solve it:

--- a/tools/workspace/pybind11/libclang_setup.py
+++ b/tools/workspace/pybind11/libclang_setup.py
@@ -38,7 +38,7 @@ def add_library_paths(parameters=None):
                 parameters.append(sdkroot)
     elif platform.system() == 'Linux':
         arch = platform.machine()
-        library_file = f'/usr/lib/{arch}-linux-gnu/libclang-12.so'
+        library_file = f'/usr/lib/{arch}-linux-gnu/libclang-14.so'
     if not os.path.exists(library_file):
         raise RuntimeError(f'Library file {library_file} does NOT exist')
     cindex.Config.set_library_file(library_file)
jwnimmer-tri commented 6 months ago

We also have the problem that when ROS Rolling changes the platform to the new Ubuntu Noble we will not have clang-12 anymore ...

Once Ubuntu 24.04 is released, Drake will support it. So this would not actually be a problem in practice.

tfoote commented 6 months ago

I think what I'd failed to realize is that when building in the ROS buildfarm we can't simply use Ubuntu dependencies by their Ubuntu package name? We somehow need to transmogrify all of our dependency names into some kind of package.xml ROS thing?

In general yes this needs to be done. We have a database with thousands of keys resolved across multiple platforms. https://github.com/ros/rosdistro/tree/master/rosdep But our default naming convention is the debian package naming convention so there should be minimal to no changes if you're already using the debian/ubuntu name.

What we've implemented for the ROS packaging process is that we encode the metadata into the single standardized machine readable format (package.xml) and then all the different tools know how to pick up and use that metadata to do what they want. We don't end up with duplicate content, or parallel lists depending on the distribution. The source toolchain can use the same source of truth as the debian build pipeline. This means that the CI is consistent on all targets. There's never an issue that if the developer fixes the focal dependencies they forget to fix the jammy dependencies maintained in parallel here: https://github.com/RobotLocomotion/drake/tree/master/setup/ubuntu/binary_distribution with additional parallel lists here: https://github.com/RobotLocomotion/drake/tree/master/setup/ubuntu/source_distribution

In the package.xml you can flag things with the appropriate flags such as build, build_export, buildtoool, exec, doc, test, as well as conflict and replace. Each of these have different semantics and the tools such as rosdep and bloom know exactly how to treat each of those instead of requiring the user to invoke one of several potential custom scripts maintained for each specific purpose. You can likely significatly reduce the number of lines in your install scripts and configs by using the package.xml and rosdep (which is available upstream as a debian package in Debian and Ubuntu https://packages.ubuntu.com/noble/python3-rosdep2).

For the short term I would suggest that simply duplicating the content into the package.xml once is best for this. The other thing that leveraging the standard database of names is that it's reviewed and validated for compatibility by the whole community.

jwnimmer-tri commented 6 months ago

Okay, I see now https://github.com/ros/rosdistro/blob/master/rosdep/base.yaml. That a pretty simple mapping, so I'm not too worried about interfacing with it, especially because as you say the Debian/Ubuntu package names are often the nominal spelling.

In that case, let me ask a different question ...

I'm translating the dependencies into rosdistro keys for package.xml and anticipated that adding a clang12 specific key to the index might be problematic given that clang already exists and the rosdistro index (resolving to the default clang version in Jammy) and the ROS maintainers probably expect the packages to be using the default version of the compiler.

I see things like this in the current base.yaml:

g++-10:
  debian:
    bullseye: [g++-10]
  ubuntu:
    focal: [g++-10]
    jammy: [g++-10]

Why is adding libclang12-dev in any way "problematic"? Why can't the process of adding Drake's dependencies to the rosdep index be as simple as some stanza like this?

libclang-12-dev:
  ubuntu:
    jammy: [libclang-12-dev]

I guess I don't really mind for Clang, since anyway in a few months Drake will switch to libclang-14-dev, and patching Drake to switch to a different Clang version is pretty easy.

What I'm really trying to understand is whether (and why) Drake would be prevented from using any package that's part of Ubuntu, as a dependency during its ROS build. If we need to limit Drake to use only a subset of Ubuntu, then I need to understand what the rules are so that Drake can follow them moving forward as we ponder adding more dependencies. To date, I'd (perhaps naively) assumed that anything that's part of Ubuntu would be available in the ROS build farm.

For the short term I would suggest that simply duplicating the content into the package.xml once is best for this.

Yes. I think hand-copying the packages-jammy.txt into package.xml is a great way to bootstrap. Down the road we'll want to have a way to keep everything in sync from a single source of truth, but that's a second-order problem for maintainability, not something we need to tackle in the first round.

tfoote commented 6 months ago

I guess I don't really mind for Clang, since anyway in a few months Drake will switch to libclang-14-dev, and patching Drake to switch to a different Clang version is pretty easy.

What I'm really trying to understand is whether (and why) Drake would be prevented from using any package that's part of Ubuntu, as a dependency during its ROS build. If we need to limit Drake to use only a subset of Ubuntu, then I need to understand what the rules are so that Drake can follow them moving forward as we ponder adding more dependencies. To date, I'd (perhaps naively) assumed that anything that's part of Ubuntu would be available in the ROS build farm.

I think that in this case it wouldn't be a problem because the packages are side by side installable and won't conflict. What we are looking to do is make sure that all keys are compatible and we don't fracture the ecosystem. For example if there's two versions of libFoo available from ubuntu version x.y.z is default but you can manually install version a.b.c too. It might conflict with version x.y.z or install side by side, but then when package A links against x.y.z and package B links against version a.b.c a new package that doesn't know about either library can crash out at link time when it links against package A and package B simultaneously. With the large distributed development community we have to make sure that there's only one version in any given distro so that all the packages can work together. Examples of packages that we've had to work hard to maintain specific versions of are VTK, OpenCV, PCL, and Ogre.

As examples here's a recent thread I found with us working to sort out Ogre 1.12 vs 1.9 in focal https://github.com/ros/rosdistro/pull/24448 and https://github.com/ros/rosdistro/pull/24767 And examples of picking various versions of core dependencies for melodic: https://github.com/ros-infrastructure/rep/pull/139 We have to pick and choose versions and thresholds based on what dependencies are available on all the platforms. So our preference is to just use the default on a platform for core things like clang. But we want to make sure that it's consistent and doesn't collide or cause unexpected behavior or incompatibilities. I think in this case libclang is much safer than say substituting clang itself. Though it's even better if it can use the default version so that it will work with everything else the also just uses the default version.

j-rivero commented 6 months ago

Why is adding libclang12-dev in any way "problematic"? Why can't the process of adding Drake's dependencies to the rosdep index be as simple as some stanza like this?

When I mentioned "problematic" I was not speaking about technically problematic for rosdistro to manage the new entry, sorry for not being clear. @tfoote (who has more experience than me into handling rosdistro submissions) has clarified here that he sees no problem on adding it if necessary together with the default clang entry.

I still have the plan of patching the code to use clang-14 if it is trivial. Otherwise, I'll submit the clang-12 entry PR.

jwnimmer-tri commented 6 months ago

What we are looking to do is make sure that all keys are compatible ...

Great, that explanation makes sense, thanks.

I still have the plan of patching the code to use clang-14 if it is trivial. Otherwise, I'll submit the clang-12 entry PR.

In https://github.com/RobotLocomotion/drake/pull/20732, my plan is to switch the Drake build on Jammy to use libclang-14-dev. To unblock the work here until that's fix is merged and released, you could either cherry-pick that patch into this build or use the other patch upthread https://github.com/RobotLocomotion/ros-drake-vendor/issues/2#issuecomment-1874433073.

j-rivero commented 5 months ago

Thanks! I've been using the patch in https://github.com/RobotLocomotion/ros-drake-vendor/issues/2#issuecomment-1874433073 successfully.