RobotLocomotion / ros-drake-vendor

Maintainer scripts that package Drake in the ROS build farm
Other
1 stars 1 forks source link

lcm packaging (libjchart2d-java not available in rosdistro) #3

Closed j-rivero closed 2 months ago

j-rivero commented 6 months ago

Drake visualizar seems to use the libjchart2d-java dependency. The buildsystem is not able to progress without it:

**[852 / 4,687] Compiling lcmgen/emit_go.c; 0s processwrapper-sandbox ... (2 actions, 1 running)
ERROR: /home/buildfarm/.cache/bazel/_bazel_/9ddd0c5aef46ee293530aad7ebd0c1db/external/net_sf_jchart2d/jar/BUILD.bazel:2:12: Extracting interface @net_sf_jchart2d//jar:jar failed: missing input file '@net_sf_jchart2d//jar:jchart2d.jar'
ERROR: /home/buildfarm/.cache/bazel/_bazel_/9ddd0c5aef46ee293530aad7ebd0c1db/external/net_sf_jchart2d/jar/BUILD.bazel:2:12: Extracting interface @net_sf_jchart2d//jar:jar failed: 1 input file(s) do not exist
Target //:install failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /home/buildfarm/.cache/bazel/_bazel_/9ddd0c5aef46ee293530aad7ebd0c1db/external/net_sf_jchart2d/jar/BUILD.bazel:2:12 Extracting interface @net_sf_jchart2d//jar:jar failed: 1 input file(s) do not exist
INFO: Elapsed time: 187.499s, Critical Path: 1.86s
INFO: 1139 processes: 490 internal, 649 processwrapper-sandbox.
FAILED: Build did NOT complete successfully
**

How is the software being used by the Drake visualizer? Is it a runtime dependency or somehow interacts with the build system?

jwnimmer-tri commented 6 months ago

It's a dependency of "lcm-spy" (bazel run //lcmtypes:drake-lcm-spy) to browse messages.

Probably the same root cause as https://github.com/RobotLocomotion/ros-drake-vendor/issues/2#issuecomment-1874353403 -- failure to install the build-deps before building.

j-rivero commented 6 months ago

Probably the same root cause as #2 (comment) -- failure to install the build-deps before building.

Sorry I was not clear on the issue summary, let me rephrase the question: being libjchart2d-javaa mandatory dependency of Drake, is it absolutely required to build the software at building time? is it only required at runtime?.

I'm double checking what kind of dependency is it and if we can really need to add an index entry for it in rosdistro (which is the index of software packages used by ROS when resolving dependencies) or we it is somehow a runtime optional/non-critical piece of Drake.

jwnimmer-tri commented 6 months ago

The lcm-spy tool is a GUI that ends up being very helpful for debugging when using LCM for messages. The tool isn't 100% required, but do I think it would be a nice service to our users to keep it around.

If it helps, we could defer it to a later milestone so that we could ship the more critical pieces first, without this dependency being a blocker.

... required to build the software at building time? is it only required at runtime?

For completeness, note that there are actually four jar files we would need:

Whether or not those jars are required would be the same for built-time and run-time. If we want the ROS package to provide the lcm-spy command, then all jar files are required at both build-time and run-time. If we want to opt-out of this tool, then neither build-time nor run-time would need any of the jars.

On macOS, the jar files are not available in brew, so Drake already contains logic to download the jar files on the fly and install them into $prefix alongside lcm-spy.

Therefore, it's probable that the following patch would unblock the build errors:

diff --git a/tools/workspace/java.bzl b/tools/workspace/java.bzl
index a6b8372772..e412bf1933 100644
--- a/tools/workspace/java.bzl
+++ b/tools/workspace/java.bzl
@@ -14,7 +14,7 @@ def _impl(repo_ctx):
     build_content = """\
 package(default_visibility = ["//visibility:public"])
 """
-    if os_result.target in repo_ctx.attr.local_os_targets:
+    if False:
         is_local = True
         filename = basename(repo_ctx.attr.local_jar)
         repo_ctx.symlink(

The patch completely disables using jar files from /usr. Instead, they are always downloaded and then installed into $prefix, no matter the OS.

jwnimmer-tri commented 5 months ago

Update -- I just noticed that Ubuntu already distributes the GUI tools we care about (see https://packages.ubuntu.com/jammy/liblcm-bin).

Perhaps we should just be relying on those, instead of having Drake rebuild them from source and install them? The question would be if the 8-year old version in Jammy and Noble (LCM 1.3.1) is suitable. There's LCM 1.5.0 released in 2023 that might have important bug fixes.

I wonder how hard it would be to get Noble bumped to 1.5.0 before it freezes? I suppose that means upgrading in Debian, and them importing into Ubuntu before its freeze at the end of February? I don't see any Debian bug (https://bugs.debian.org/cgi-bin/pkgreport.cgi?repeatmerged=no&src=lcm) requesting a new version.

Are there any ROS people qualified as Debian developers who could help move that along? Or other practices / habits where ROS could make available a newer version of the package itself?

j-rivero commented 5 months ago

Update -- I just noticed that Ubuntu already distributes the GUI tools we care about (see https://packages.ubuntu.com/jammy/liblcm-bin).

Perhaps we should just be relying on those, instead of having Drake rebuild them from source and install them?

That would help to simplify the Drake building process indeed.

The question would be if the 8-year old version in Jammy and Noble (LCM 1.3.1) is suitable. There's LCM 1.5.0 released in 2023 that might have important bug fixes.

For using a fixed version of dependencies I would recommend to choose a version that is well tested by your software and you are quite sure that it worked fine during months since the decision of using system dependencies is most of the times difficult to revert and the system dependencies on Ubuntu did not change unless severe/security bugs appear. I see that Drake is using 1.5.0.

I wonder how hard it would be to get Noble bumped to 1.5.0 before it freezes? I suppose that means upgrading in Debian, and them importing into Ubuntu before its freeze at the end of February?

That is the usual path, upgrade Debian and them Ubuntu will directly sync from Debian before the Import Freeze happen (February 29th).

I don't see any Debian bug (https://bugs.debian.org/cgi-bin/pkgreport.cgi?repeatmerged=no&src=lcm) requesting a new version.

That could be the first step indeed. I see that the software is managed by the Debian QA group which can probably accept a PR updating the software https://tracker.debian.org/pkg/lcm

Are there any ROS people qualified as Debian developers who could help move that along? Or other practices / habits where ROS could make available a newer version of the package itsel

I've a long tradition of working and maintaining packages on Debian https://qa.debian.org/developer.php?login=jrivero@osrfoundation.org, I can probably take a look into how hard would to update lcm on Debian, time permitting.

jwnimmer-tri commented 5 months ago

For using a fixed version of dependencies I would recommend to choose a version that is well tested by your software ...

I agree. So let's NOT plan to try to use 1.3.1 at all.

That could be the first step indeed. I see that the software is managed by the Debian QA group which can probably accept a PR ... I've a long tradition of working and maintaining packages on Debian ...

I went back-and-forth on this a lot. On the one hand, I think it would be a helpful contribution and I would support funding it in the SOW. On the other hand, it has a very short deadline in order to hit Ubuntu 24.04 cut-off, but doesn't actually help us with ROS Humble which is our near-term goal of this SOW.

Let me propose a different approach and see what you and @tfoote think:

We could aim to package https://github.com/lcm-proj/lcm into ROS Rolling (and eventually Humble), in a similar spirit as Drake -- as a vendor package in the build farm. If we find problems with the upstream build system CMakeLists, we would plan to submit those upstream. After we have that working and the experience under out belt, we can move on to upgrading the package in Debian Sid as discussed. That probably wouldn't make it in time for Ubuntu Noble merge window, but oh well. If / when the OS package ever catches up, we could have the vendor package fall back to it, as I understand it.

To sum up: near term, the Drake vendor package would also build & install the LCM library, tools, etc using our work-around patch discussed upthread. Medium term, the aim would be to unbundle the LCM build into its own ROS package. Long term, we'll upstream those improvements and try to get Sid in better shape, time permitting.

WDYT?

tfoote commented 5 months ago

Packaging LCM into ROS like that, overriding a system version, is one of our anti-goals. If we were to do that we would break any package that depends on the system version with runtime builds. And we would consequently have to build all of them in our tree/workspace. This can even worse get into runtime crashes if someone links against the system version and then we try to link against their library. Which means that we effectively have to make ourselves incompatible with the system version and uninstall it to prevent crashes.

We then have to create the new recommended way for people to include/depend on lcm that we will keep in the rosdistro for one cycle, and then deprecate for the next distro and support for several years while we tell everyone to switch back off of the shim/vendor package that we just built and told them all to use. This will cause multiple years of headaches and extra documentation user confusion, and people asking us why we didn't just use it the "usual" way.

To that end, if there's one package that needs the capability our general policy is that they should use it in an isolated/embedded way that won't collide with external users. The headers shouldn't get into external include paths, and the libraries should be isolated to not be exposed and potentially cause linking collisions for others. Drake's static build helps a lot with this.

So I would strongly suggest that we push to get lcm 1.5.x into Ubuntu Noble if possible. Obviously that won't work for Humble, but we'll likely want to move forward in the not too distant future. And if we can know that building it inside is scoped with an expected end date is nice. But I'd also push back that just because there's a newer version available and "might have important bug fixes" is a very weak reason for trying to upgrade. If there are "important" bug fixes that are needed they usually will be known/discovered.

This is an important part of being in a distro. We have to move with and coordinate with the whole community. And in general there is always things that are available "newer" than what's in the distro, but we have to take the lowest common denominator of all our supported platforms to consider it "available". Individual projects always have a short list of things that they want newer than is the stable versions. But if all projects want their specific preferences bleeding edge, with enough project requests the whole system is bleeding edge and we've lost all the stability. I know it's hard to wait but it's how we can make things accessible.

jwnimmer-tri commented 5 months ago

Those are good points. I hadn't considered the runtime library side.

My note wasn't completely clear, so let me clarify a few things before going back to the main points.

Drake will nominally always vendor its own patched build of LCM's runtime library an internal dependency (libdrake_lcm.so) -- with different symbol names than upstream lcm, and without any headers provided. (The symbol-renamed vendoring has been an ongoing project, and should reach completion in the v1.26.0 release next month.)

In my note above, I was focusing on the command-line tools (but I didn't communicate that clearly). The relevant Debian package is liblcm-bin with the following command-line tools:

Those are the programs that induce the dependency on libjchart2d-java which is at question in this thread. (To be clear, not all of them use jchart2d, but lcm-spy does.)

For users debugging Drake projects, especially lcm-spy is useful to monitor IPCs in realtime.

So the question is how to allow users to run lcm-spy (and ideally the other programs, too) in the ROS ecosystem. I see four options:

(1) The ros-drake-vendor build would not install any LCM command-line tools. Instead, we advise the user to install liblcm-bin from Ubuntu, as-is. That would give them 1.3.1.

(2) The ros-drake-vendor build would not install any LCM command-line tools, but we also work to get 1.5.0 into Debian so it trickles down into Noble. Advise the user to install liblcm-bin from Ubuntu, which would give them 1.3.1 on Jammy/Humble but 1.5.0 on Noble/Rolling/etc.

(3) The ros-drake-vendor build installs the LCM command-line tools, but relies on Ubuntu to provide runtime dependencies like jchart2d. I think this means we would need to add entries for those dependencies into rosdep.

(4) The ros-drake-vendor build installs the LCM command-line tools, as well as their runtime dependencies like jchart2d. (This is what the draft #6 is currently doing.) This has the downside that we're bundling libraries that already exist (with sufficiently new versions) in Ubuntu into the ros-drake-vendor output.

I think using option (4) is a fair posture for now, and we can start building packages under that assumption. The question is then in which direction (if any) we want to try to improve to have less bundling over time. Whatever ends up being available via rosdep we could stop bundling into ros-drake-vendor output.

In that light, I guess the bundling of jchart2d on its own isn't a major downside. If we're bundling lcm-spy already, we might as well bundle its dependencies -- so option (3) isn't much benefit.

In that case, I think it's fair to aim for option (2) as the next improvement -- trying to get lcm 1.5.0 into Sid sooner than later. In the meantime, we can just plan to bundle lcm-spy & related directly into our Drake packages.

tfoote commented 5 months ago

Ahh sorry I missed an understanding of the scope.

Stepping back a bit, my first thought is that LCM debugging tools shouldn't be bundled into drake. An optional additional debugging command line tool like this is clearly easily installable afterwards separately. Us needing to run down this dependency shows that by coupling this one tool into the main build we're now making all the users install extra dependencies which otherwise would be optional.

This is especially true as the LCM installation doesn't have any dependencies on anything we've built. Since this is truely decoupled from our build it should definitely be separate.

I took a look at the release notes since 1.3.1: https://github.com/lcm-proj/lcm/releases and they appear to mostly be minor with a few new feature in 1.4.0. So I would lean towards option 2 by default. Unless there's a known issue specifically we shouldn't take on the burden of building and releasing this. Also as there's already potentially conflicting binaries on the system we shouldn't install to the standard path and or use the same executable names is someone is relying on existing systems using the lcm binaries in 1.3.1 on their system.

That said if there is use cases related to Drake that are hitting these blocker issues using 1.3.1 we can work to support those users. Obviously a request to push the upstream forward to 1.5 would make sense generally. But that won't get to every target platform. So if it's more urgent, we can easily set this up for the user who wants this capability to download and install a newer verison of lcm on their system. So if it's a blocker, I would recommend easily setup an Ubuntu PPA to build liblcm-bin 1.5 for any of our target platforms, ala Jammy and maybe Noble if it doesn't go upstream in time. This will work for all Ubuntu based ROS users not just the ROS ones, and will reduce the size of the build and installation of Drake itself. And at the time that liblcm-bin does reach 1.5.x upstream on all platforms packaging and building this will just drop away from the drake communities responsibility.

Note that the PPA with 1.5.0 in it is also a very good way to test out and demonstrate functionality for submitting 1.5.0 upstream to Ubuntu and debian too.

j-rivero commented 5 months ago

I took a look at the release notes since 1.3.1: https://github.com/lcm-proj/lcm/releases and they appear to mostly be minor with a few new feature in 1.4.0. So I would lean towards option 2 by default.

+1. I was looking into the differences from 1.3.1 to 1.5.0 from a quantitative approach and there are more than 500 commits and more than 500 files changed although as Tully said the changelog does not display really intrusive changes.

That said if there is use cases related to Drake that are hitting these blocker issues using 1.3.1 we can work to support those users.

If I understand the proposal correctly this will transform into changes:

  1. We drop the lcm build and installation from the Drake ROS package and we add the runtime dependency on system lcm (this will be 1.3.1 by now)
  2. We plan to work actively into getting 1.5.0 into Debian. Works can be accelerated if problems arise on using the 1.3.1 version setting up a PPA to have a quick fix for the problems.

Anyway I've sending the initial bug to consider the update of the lcm on Debian https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1061300. I don't have too much hope but maybe someone among the maintainers is willing to help and speed up things.

@jwnimmer-tri let me know if this works for you.

jwnimmer-tri commented 5 months ago

We plan to work actively into getting 1.5.0 into Debian.

Yes, I support this in terms of being a good community steward. An updated Debian Sid package seems worthwhile no matter what other things we do (so long as it fits in our budget).

I also posted https://github.com/lcm-proj/lcm/issues/486 to the LCM upstream in case they have any comments or want to help.

Also as there's already potentially conflicting binaries on the system we shouldn't install to the standard path and or use the same executable names is someone is relying on existing systems using the lcm binaries in 1.3.1 on their system.

Does #6 install these binaries in a way that would hazard this confusion? In other words, is /opt/ros/{dist}/opt/drake/bin added to the user's shell $PATH by default? (I haven't looked carefully yet.)


Now, back to question of what to do in the interim:

I tried running some of the tools from Jammy (1.3.1). At least 3 of the 5 tools are broken in those packages (lcm-spy, lcm-logger, lcm-logplayer-gui). It seems like one works fine (lcm-gen). For the last one (lcm-logplayer), it's the least important for us so I didn't try running it -- from the commit logs it does have a low-impact bug fix for mishandled command-line arguments, but I won't call that a deal breaker. In short, 2 of the 5 tools work okay, and 3 of the 5 are broken including the most important one (lcm-spy). Not entirely surprising, given that it's an orphaned package with a narrow userbase.

In terms of a PPA, that is the worst of all worlds. The entire point of this SOW is to make it easy for ROS users to run Drake without extra complications. If they need to add a PPA to have working tools, then we've missed the mark. (To to clear, if you want to point me at a PPA to beta-test some proposed packages, that's still fine. It's only the workflow for end-users where the PPA is a non-starter.)

So, let me try to be specific to avoid too much much back-and-forth time here:

Option (4) -- which is currently implemented in the draft (PR #6) -- is a fine solution from my end. It seems to be the only option that gives users working tools, without undue hassle. Is it acceptable to ship it that way in Rolling? And then eventually Humble?

Let's try to identify any acute problems with option (4), and iterate on those to see if we can make it work.

Later, if we see that 1.5.0 has made it into Noble, we can amend the ROS Drake packages to cull the redundant 1% from the package download by using the system binaries.

tfoote commented 5 months ago

We can start with (4) as proposed, but moving it back to (3) would be valuable. Adding the rosdep keys for existing packages is trivial. It takes ~15 minutes of cross checking package versions on different platforms, and then it's one less thing bundled in.

The important things to check for (4) is that we won't break anyone who is linking against the system lcm version and against drake. And secondly we need to make sure that we don't break people using the lcm-* executables on the command line. If they are fully backwards compatible it's probably ok to just add it to the path in an alternative location. But if there's differences we should give them a different name so that people can know which ones they're getting and be able to select whether they want the bundled one or the standard one.

In terms of a PPA, that is the worst of all worlds. The entire point of this SOW is to make it easy for ROS users to run Drake without extra complications. If they need to add a PPA to have working tools, then we've missed the mark.

I understand this. But we also need to keep in mind that the majority of lcm users are not Drake users and we need to not break them to make the Drake user experience better.

j-rivero commented 5 months ago

I tried running some of the tools from Jammy (1.3.1). At least 3 of the 5 tools are broken in those packages (lcm-spy, lcm-logger, lcm-logplayer-gui). It seems like one works fine (lcm-gen). For the last one (lcm-logplayer), it's the least important for us so I didn't try running it -- from the commit logs it does have a low-impact bug fix for mishandled command-line arguments, but I won't call that a deal breaker. In short, 2 of the 5 tools work okay, and 3 of the 5 are broken including the most important one (lcm-spy). Not entirely surprising, given that it's an orphaned package with a narrow userbase.

Not entirely surprising indeed. I've started to look a bit more into updating the lcm package for Debian. It needs a bit more of love but I can probably have a working package and after that a QA-ready for debian submission in a matter of days. Not sure if we will get the train before Noble, but I'll give it a try.

jwnimmer-tri commented 5 months ago

We can start with (4) as proposed, but moving it back to (3) would be valuable. Adding the rosdep keys for existing packages is trivial. It takes ~15 minutes of cross checking package versions on different platforms, and then it's one less thing bundled in.

Fine by me.

(I think because the libclang-12-dev debate was my first encounter with trying to add existing Ubuntu packages into the rosdep index, I misapprehended the median difficulty of adding things. Packages that ship multiple concurrent versions in Ubuntu are where the difficulty grows.)

The important things to check for (4) is that we won't break anyone who is linking against the system lcm version and against drake.

This requirement falls on the Drake side, to ensure that our build of the C code is sufficiently safe. We can revisit and confirm this once candidate packages are ready.

And secondly we need to make sure that we don't break people using the lcm-* executables on the command line. If they are fully backwards compatible it's probably ok to just add it to the path in an alternative location. But if there's differences we should give them a different name so that people can know which ones they're getting and be able to select whether they want the bundled one or the standard one.

I asked about this upthread, but didn't have any progress yet. Re-quoting:

Also as there's already potentially conflicting binaries on the system we shouldn't install to the standard path and or use the same executable names is someone is relying on existing systems using the lcm binaries in 1.3.1 on their system.

Does #6 install these binaries in a way that would hazard this confusion? In other words, is /opt/ros/{dist}/opt/drake/bin added to the user's shell $PATH by default? (I haven't looked carefully yet.)

In any case, I'm happy to revisit the question of java binaries once candidate packages are ready. It will probably be easier to discuss specifics once we have testable artifacts ready.

I understand this. But we also need to keep in mind that the majority of lcm users are not Drake users and we need to not break them to make the Drake user experience better.

I agree that our new package breaking other packages would be a critical bug, and solving those problems would be a prereq of shipping in Rollling.

The intent behind option (4) is that everything useful is made available, but off to the side in a way that doesn't interfere.

... updating the lcm package for Debian ...

Sound good. FYI the lcm maintainers seem happy to help if there are any build-error questions.

j-rivero commented 5 months ago

Took a bit longer than expected but lcm-1.5 packaging for Debian/Ubuntu is ready. The PR to update the debian package is in place: https://salsa.debian.org/debian/lcm/-/merge_requests/1. Need to start pinging bugs and issues to see if we gain an sponsor for it.

Meanwhile, I've setup a PPA for you to test @jwnimmer-tri on Ubuntu Jammy: https://launchpad.net/~j-rivero/+archive/ubuntu/lcm-1.5 . It would be great if you can give the package a try, probably install liblcm-bin should be enough.

jwnimmer-tri commented 4 months ago

Thanks, I've tested and it looks good to me.

I installed sudo apt install liblcm-bin at Version: 1.5.0+repack-5.

I ran these programs under some basic conditions and they worked as expected:

/usr/bin/lcm-gen
/usr/bin/lcm-logger
/usr/bin/lcm-logplayer
/usr/bin/lcm-logplayer-gui
/usr/bin/lcm-spy
/usr/bin/lcm-tester
j-rivero commented 4 months ago

Thanks, I've tested and it looks good to me.

Thanks @jwnimmer-tri ! Good to know that is working. I hope to find a Debian sponsor in the following days/weeks.

j-rivero commented 4 months ago

Let's also try asking Ubuntu directly to update the 1.5 packages on Noble. https://bugs.launchpad.net/ubuntu/+source/lcm/+bug/2053256

j-rivero commented 3 months ago

Feature Freeze Exception send to Canonical to get lcm 1.5 on Noble. Accepted by the release team. Still to be merged but we should be good on track https://bugs.launchpad.net/ubuntu/+source/lcm/+bug/2053256 .

j-rivero commented 3 months ago

Update: canonical developers requested a merge request to peer review the changes. We are on it: https://code.launchpad.net/~j-rivero/ubuntu/+source/lcm/+git/lcm/+merge/463155

j-rivero commented 2 months ago

After more efforts than expected, the lcm 1.5 package is ready on Ubuntu Noble. https://packages.ubuntu.com/source/noble/lcm

tfoote commented 2 months ago

Great work @j-rivero !