RobotLocomotion / ros-drake-vendor

Maintainer scripts that package Drake in the ROS build farm
Other
1 stars 1 forks source link

Drake ROS package integration in the system #8

Closed j-rivero closed 3 months ago

j-rivero commented 5 months ago

The Drake ROS package is going using the ament_vendor function for ROS.

The use case would be:

  1. A user builds or install the drake_vendor (or whatever name we decided in #7) ROS package
  2. A user source usual setup.{bash,sh,bat} script (typically /opt/ros/${ROS_DISTOR}/setup.bash when using binary packages on Linux) that enables all the ROS integration mechanisms into the user session (added "vendored" bin/ to the PATH and lib/ to LD_LIBRARY_PATH mainly).

For getting Drake using drake_vendor downstream software is expected to find_package(drake_vendor). This acts as a sort of acknowledgement that the downstream software is consuming the vendor package and not the system package and It gives the vendor package an opportunity to inject CMake logic (in our case we probably want to inject find_package(drake) to be run automatically when calling find_package(drake_vendor).

Summarizing we would expect users to:

  1. Someone install/builds the drake_vendor
  2. source the /opt/ros/${ROS_DISTOR}/setup.bash
  3. Use in his/her CMakeLists.txt (being a ROS package or a plain CMake package) find_package(drake_vendor) instead of the find_package(drake PATHS /opt/drake), like:

    +++ b/drake_ament_cmake_installed/src/drake_ament_cmake_installed/CMakeLists.txt
    @@ -12,7 +12,7 @@ set(CMAKE_CXX_STANDARD 17)
    set(CMAKE_CXX_STANDARD_REQUIRED ON)
    
    find_package(ament_cmake CONFIG REQUIRED)
    -find_package(drake CONFIG REQUIRED PATHS /opt/drake)
    +find_package(drake_vendor CONFIG REQUIRED)

There is an alternative way for the system integration that involves the use of the GLOBAL_HOOK of the ament_vendor function. That parameter is going to add to system CMAKE_PREFIX_PATH the location of the installed vendor package in this case /opt/ros/rolling/opt/drake_vendor when the user source /opt/ros/${ROS_DISTOR}/setup.bash.

That is going to make that existing examples like drake_cmake_installed_apt will work automatically without any change since CMake will locate the drake-config.cmake directly from the ROS installation.

This alternative presents some problems and limitations. In the presence of a system installation of drake in /opt/drake a ROS user of drake_vendor can be perfectly confuse about when system installation or the ROS installation are going to be used. We also lost the ability to inject custom logic to the entry path of find_package(drake_vendor).

I would prefer to use the non GLOBAL_HOOK option but @tfoote, @jwnimmer-tri want to be sure that if fits well the initial goal of the package.

jwnimmer-tri commented 5 months ago

I have no experience with ament nor these kinds of questions, so mostly I'll just defer to you guys to choose what is best.

My overall goal is that people who are regular ROS users and familiar with how to build ROS-adjacent software can easy add and use Drake as a dependency in their own project. Generally I agree with the principle that we would try to avoid situations where the user would be confused about what's happening, or end up accidentally using the wrong version of a dependency, but I don't have a sense of how risky various options would be, nor the usability upsides / downsides of any option.

I'll also \CC in @IanTheEngineer and @calderpg-tri in case you'd like to offer any feedback on the OP.

tfoote commented 5 months ago

I agree with your assessment that I would recommend that we use the isolated version and ask people using drake in the ROS environment to use find_package(drake_vendor) If that's the only difference then it's a relatively simple to document.

It's a little frustrating as I suspect that you'll see people writing to be generic which isn't great. But I think that it will be better than people accidentially getting the version that they weren't expecting which is

if(ROS_VERSION)
    find_package(drake_vendor)
else()
    find_package(drake CONFIG REQUIRED PATHS /opt/drake)
endif()

We could put this sort of recommendation in for use with ROS clearly in the docs so that people can make things dual compatible.

calderpg-tri commented 5 months ago

My initial reaction is that using drake_vendor is just a loss in usability.

As I see it, there are three possible use cases to consider:

  1. User consumes ROS-provided binary Drake
  2. User consumes upstream binary Drake
  3. User consumes a local built-from-source Drake

I don't see why they should need to write different find_package calls for those three cases when a significant feature of ROS workspace structure/convention is that you can have both package foo that ships in binary form, and a local overlay of that package you build from source, and your user code does not need to know the difference. CMake directly (and Colcon + ament) already provide the affordances necessary to order and select which version you want to find.

One of the closest existing examples I can think of is how OMPL is used in MoveIt. MoveIt simply calls find_package(ompl REQUIRED), there is no ompl_vendor despite this having the exact same risks as mentioned here with Drake. If you want to build MoveIt against a different version of OMPL than the ROS binaries while having the binaries installed, you need to set CMAKE_PREFIX_PATH appropriately. If find_package(ompl REQUIRED) is good enough for MoveIt, I don't see a strong argument that find_package(drake REQUIRED) is problematic.

j-rivero commented 5 months ago

One of the closest existing examples I can think of is how OMPL is used in MoveIt. MoveIt simply calls find_package(ompl REQUIRED), there is no ompl_vendor despite this having the exact same risks as mentioned here with Drake.

The ompl case is bringing to the scene a third possible road (1. and 2. explained in the initial comment of this issue) which might be the only option if the Drake team want to keep the use of find_package(drake) for the usability reasons that you explained well above @calderpg-tri:

  1. Use a drake_vendor package with the ament_vendor GLOBAL_HOOK enabled
  2. Use a drake_vendor package without the ament_vendor GLOBAL_HOOK
  3. Consider Drake upstream as a ROS Package This is the case of ompl (and others) where the upstream sources host a curated version of a package.xml. Some points about using this case:
    • The maintenance and responsibility of the ROS package relies in the upstream maintainers differently from the vendor packages where generally different ROS teams update, maintain and release the vendor package.
    • We lost the ability of applying patches to the upstream code using the ament_vendor interface (like the ones we hosted in this repo under #6)
    • If at some point we need to add some ROS/ament CMake code for packaging, that should be done in the upstream code.

Let me add that the ompl is to my eyes an unusual/advance case of a ROS package. It does not host the generated CHANGELOG.rst and its configuration for generating packages seems to be adapted to ask the user information about versions, differently from the standard use case of handling the whole release process automatically by the ROS tools (in particular catkin_generate_changelog and caktin_prepare_relase) and recommend in the documentation.

tfoote commented 4 months ago

For reference there's related discussion in #7

jwnimmer-tri commented 4 months ago

I don't have the lived experience of a software developer who uses ROS, so I can't offer much specific feedback here.

My overall goal is still just the main idea of https://github.com/RobotLocomotion/ros-drake-vendor/issues/8#issuecomment-1910877512 -- for a software developer who customarily works in the ROS ecosystem, we want Drake to be easily available in a familiar way. I must to defer to @j-rivero and @tfoote (and @calderpg-tri and @IanTheEngineer) to speak to the spellings that best accomplish that. This is the specific reason we've executed the SOW -- to hire the best experts in the world to solve the problem the right way.

I can perhaps offer one simplification of the requirements though:

In the presence of a system installation of drake in /opt/drake a ROS user of drake_vendor can be perfectly confuse ...

I think we should declare that out-of-scope.

If the user is using ROS, we can assume that they have neither unpacked a https://drake.mit.edu/from_binary.html tarball as root into /opt/drake, nor that they have added the https://drake.mit.edu/apt.html site to their sources list and installed from there. At most, they might have some Drake artifacts lying around somewhere in their home directory. For the purposes of choosing the best mechanism for ROS users, "accidental mis-use of a packages already installed system-wide" should be heavily de-prioritized.

This is not to say that such a situation will never occur -- it's easy enough to imagine that can happen occasionally. I'm just saying that I expect it to be extremely rare, and I would much rather prioritize the everyday users who don't have that situation.

tfoote commented 4 months ago

This is not to say that such a situation will never occur -- it's easy enough to imagine that can happen occasionally. I'm just saying that I expect it to be extremely rare, and I would much rather prioritize the everyday users who don't have that situation.

If we do end up with this we need a really good way to detect and provide a strong message about the conflict and how to resolve it.

Unfortunately although for the average user this is likely low probability of collision. But I think that there's two users for whom this will be a major problem. The first is the new potential user who finds the Drake project online. And decides to try out drake following the instructions on the website. And after trying it finds further documentation of the ROS bindings, so the next thing that they do is install the ROS version and then it doesn't work because they found this corner case.

And the second important case is a core drake developer who already has drake installed on their system and is actively developing. And then they want to test out the ROS package and maybe see how things are working there. Asking them to clear/uninstall their already installed workspace to be able to test the ROS version is going to be a non-starter.

I had a brief conversation with @IanTheEngineer about this and he reiterated that he's value would be that we just use find_package(drake) to keep the user experience. The whole point of using find_package infrastructure is that you can override and enable the user of different versions of packages by adjusting the CMake search path. I think that the issue #7 should be resolvable with it colliding with an internal representation/generated content. Where we'll have to absorb that and potentially integrate and extend (maybe internally rename and include the file during the build) to enable us to extend the capabilities with the extra exports that ROS provides.

The bigger challenge is the potential collisions with the upstream installation. It should be perfectly reasonable to have multiple installations and we need to make sure that the CMake prefix path is setup correctly to find our version before the system version. The one challenge is that if the "upstream" recommendation that everyone is using doesn't allow that. For example having /opt/drake hard coded into the CMakeLists.txt of default packages. Is this currently the recommendation? Can we update the recommended path for people using drake to inject /opt/drake into their CMAKE_PREFIX_PATH instead of hard coding it into the CMakeLists.txt? Thus we would avoid the conflict if we're controlling the CMAKE_PREFIX_PATH for the ROS installation via our env hooks.

jwnimmer-tri commented 4 months ago

(Replying slightly out-of-order.)

And the second important case is a core drake developer who already has drake installed on their system and is actively developing. And then they want to test out the ROS package and maybe see how things are working there. Asking them to clear/uninstall their already installed workspace to be able to test the ROS version is going to be a non-starter.

Maybe I'm not understanding this part, but I don't think this is relevant. No Drake developer installs Drake into /opt as root on their system, that would be completely insane. We all just build/test Drake in arbitrary subdirectories inside our home directory (e.g., in my case everything Drake-related lives in $HOME/jwnimmer-tri/drake), which is nowhere that CMake / ROS / Bloom anything should ever look for, so there can never be a conflict. (We don't set LD_LIBRARY_PATH or PATH or anything either, that CMake might get a hint from.)

Unfortunately although for the average user this is likely low probability of collision. But I think that there's two users for whom this will be a major problem. The first is the new potential user who finds the Drake project online. And decides to try out drake following the instructions on the website. And after trying it finds further documentation of the ROS bindings, so the next thing that they do is install the ROS version and then it doesn't work because they found this corner case.

For clarity -- the only way a Ubuntu user of Drake would end up with Drake installed in /opt on their root filesystem using our website instructions is if they used our https://drake.mit.edu/apt.html package. We do not tell anyone to sudo make install nor sudo tar xfz drake-binary.tgz. The only thing users should be running under sudo is apt.

I'm not sure what your "doesn't work" would look like here, but in any case if we really need good way to "detect and provide a strong message" we could consider marking Conflicts: on the ROS *.deb against the https://drake.mit.edu/apt.html package name (drake-dev). I guess I'd want to see what the situation looks like before we go that far, though. Hopefully there is a simpler solution, of just having the build output be clear which Drake was selected, and a cheat sheet for how to fix it in case it was the wrong one.

The one challenge is that if the "upstream" recommendation that everyone is using doesn't allow that. For example having /opt/drake hard coded into the CMakeLists.txt of default packages. Is this currently the recommendation?

I think Drake is agnostic on this front? However people want to organize their workspace to find packages, they should feel free to do that. We do have some examples at https://github.com/RobotLocomotion/drake-external-examples with sample recipes. If we need to improve those examples to be less of a footgun for future ROS users, that's fine by me. They are just samples, not legally binding.

tfoote commented 4 months ago

After a ping from @j-rivero I've come back to this.

If we're only worried about /opt/drake coming from a deb package then I think that we can relatively easily believe that we can add the debian package conflict to prevent those being accidentally used together by using the Conflicts keywork.

I would rather not prevent people from side by side installations, but since we've reduced the scope of conflicts to the apt packages then reinstalling on the other side is relatively low cost too.

With that we would want to get buyin to simultaneously change how it's is recommended and documented to add drake to the CMake search path instead of hard coding it in individual packages. @jwnimmer-tri you're right that we can make sure that the console output is useful and makes it easier to fix. But the challenge is that this isn't guarentteed to be fixable becaues it may be embedded into someone else's library that you're depending on, potentially even if installed via binary which is not in your source workspace so may not be fixable. So fixing this practice as quickly and as universally in the Drake documentation as possible will be valuable. And potentially even doing a communication effort to actively change people's usage patterns would be good.

In the long run if everyone changes their behavior to not inject the path in code, but through the environment we could actually eventually remove the conflicts down the line as the two install locations are theoretically not in conflict and a user could put either onto their path and find_package can be guided.

And @j-rivero you're right that we may have to create a custom drake config that proxies the upstream config transparently.

jwnimmer-tri commented 4 months ago

So fixing this practice as quickly and as universally in the Drake documentation as possible will be valuable.

Fine by me.

... we would want to get buyin to simultaneously change how it's is recommended and documented to add drake to the CMake search path instead of hard coding it in individual packages.

Where are the docs need fixing, and in what way should they be changed? (If it's easier to open PRs with the changes, that's fine by me too.)

Here's my guess as to what you mean:

It is just these two lines?

https://github.com/RobotLocomotion/drake-external-examples/blob/1e71ba322ac713bd3e19d56b729e4bb630347d44/drake_ament_cmake_installed/src/drake_ament_cmake_installed/CMakeLists.txt#L15

https://github.com/RobotLocomotion/drake-external-examples/blob/1e71ba322ac713bd3e19d56b729e4bb630347d44/drake_catkin_installed/src/drake_catkin_installed/CMakeLists.txt#L15

I see some text in the near README files about --cmake-args and CMAKE_PREFIX_PATH.

So maybe the fix we want is to remove the PATHS /opt/drake from the CMakeLists, and make the requirement to have a working CMAKE_PREFIX_PATH established be mandatory, with no caveat related to /opt/drake?

tfoote commented 4 months ago

Yeah, we should be teaching people to set the CMAKE_PREFIX_PATH or providing convenient utilities to do that. (Aka one of the primary roles of the ROS setup.(ba)sh files.

jwnimmer-tri commented 4 months ago

It sounds like everything will be fine, in that case.

The drake-external-examples demos show users examples of how to use Drake in their build system / package manager of choice. The two examples I linked above were "ament cmake" and "catkin installed", which demo how to use Drake when ROS tools are the users's build system / package manager of choice. Under this SOW, we'll be adding Drake packages to ROS's apt repositories directly, and therefore those ROS examples will be rewritten / replaced / refactored to refer to the new ROS debs, in lieu of the old Drake Apt Site debs. At the very latest, the ill-conceived advice will vanish at that time.

If you'd prefer to open a PR to tweak the advice now, so that we have a couple less months of bad advice hanging around, that's OK by me. But I'd also be OK if you prefer to wait to fix the examples until after Drarke's ROS debs are full available.

j-rivero commented 3 months ago

Thanks everyone for the feedback and the hints.

And @j-rivero you're right that we may have to create a custom drake config that proxies the upstream config transparently.

Keeping the same drake name for this repository and the underlying upstream package was something new to deal with but I think that I found a solution that can work with ROS packaging being installed system wise for ros-${rosdistro}-drake and for the colcon workspaces, I wrote detailed information in #7 and implemented it in #6.

Going to close this but please feel free to reopen if more work needs to be done.