SteveMacenski / spatio_temporal_voxel_layer

A new voxel layer leveraging modern 3D graphics tools to modernize navigation environmental representations
http://wiki.ros.org/spatio_temporal_voxel_layer
GNU Lesser General Public License v2.1

Impossible to build for Ubuntu Jammy 22.04 / Humble (probably because of OpenVDB / TBB) #232

Closed doisyg closed 9 months ago

doisyg commented 2 years ago

Hello, FYI and to track the issue: In preparation for the Humble release I tried to build STVL under Humble beta with Ubuntu 22.04 and the blocking point seems to be, surprise, the version of the OpenVDB package released in Ubuntu. My quick analysis of the cause:

OpenVDB seems to already have a version (v8.2) compatible with TBB 2021.5 since November: https://github.com/AcademySoftwareFoundation/openvdb/releases/tag/v8.2.0. My guess is that releasing v8.2 instead of v8.1 to Ubuntu 22.04/jammy would solve the issue. I will try to file a bug report on Launchpad, hoping that it will be faster than https://bugs.launchpad.net/ubuntu/+source/openvdb/+bug/1882998 (remember https://github.com/SteveMacenski/spatio_temporal_voxel_layer/issues/167).

(Side note: looking at the OpenVDB release page, https://github.com/AcademySoftwareFoundation/openvdb/releases, I noticed the release of v9.0 with:

> Official release of NanoVDB, which for the first time offers GPU support for static sparse volumes in OpenVDB.

It is GPU agnostic, more info: https://academysoftwarefoundation.github.io/openvdb/NanoVDB_FAQ.html @SteveMacenski, did you follow this? Maybe that's interesting to look at for STVL and voxels in nav2.)

SteveMacenski commented 2 years ago

Thanks for bringing this up / investigating, this is very frustrating. I think you've taken all of the reasonable steps, let me know if there's anything I can do to help! This seems like a crazy oversight and I really hope it's handled quickly. Who's deciding on these OpenVDB versions? Is the problem that it's being released by someone else haphazardly?

I think there's nothing wrong with multiple released versions, so if 8.2 handles it, we can target that and update the rosdistro keys to use that one explicitly. How is 8.1 built / released without TBB?

I have not followed the NanoVDB work. I'm happy to see some of that Nvidia GVDB work getting back into the main repository vs trying to consume OpenVDB with their own project. I would agree that sounds indeed interesting - would that contain some of these same issues as we see with OpenVDB / TBB on versioning? I haven't found a good list of "where" the GPU is used most other than raycasting, which we don't use here, but it claims random access is faster, which we use a lot of. So either way, it's a good bet.

I'd love to use NanoVDB as the basis of a new nav2 environmental model, but all these versioning issues are killing us and if this was in the core nav2 stack, that could be critical to release and put us back months on each OS release. The docs say NanoVDB has few external dependencies, is it true that it would solve this issue for us? If it is, and it could stop future versioning issues, it might be a good idea to do it here as a trial and use our experience there to inform our future decisions. Having GPU-enabled raycasting could really help save me a ton of time down the road on a voxel-grid replacement in Nav2.

nachovizzo commented 2 years ago

Hello there, just throwing in my 2 cents.

In case you are interested, I managed to build OpenVDB from the source within the project build. The build script is here. Feel free to copy-paste it, just acknowledge the source :) With this "superbuild" approach I managed to also build the application on different Ubuntu, Fedora, CentOS distributions, macOS, and so on.

I quickly tried to build vdbfusion on Ubuntu 22.04 and the OpenVDB build was working and linking correctly to the application. There was a new compiler error (GCC 11 on Ubuntu 22.04) for the TBB build but I managed to disable it from the compiler flags.

I hope this helps, if you need extra support, or if you would like me to help migrate the build system just let me know.

NOTE: The drawback of this approach is that you need to build OpenVDB from source (slow), but using ccache basically turns this into a one-time problem.

Best wishes!

doisyg commented 2 years ago

> Thanks for bringing this up / investigating, this is very frustrating. I think you've taken all of the reasonable steps, let me know if there's anything I can do to help! This seems like a crazy oversight and I really hope it's handled quickly. Who's deciding on these OpenVDB versions? Is the problem that it's being released by someone else haphazardly?

I am not familiar with the Ubuntu release process so I don't really know, but it is certainly not triggered by the OpenVDB team.

> I think there's nothing wrong with multiple released versions, so if 8.2 handles it, we can target that and update the rosdistro keys to use that one explicitly. How is 8.1 built / released without TBB?

No idea... maybe I am missing something obvious.

> I have not followed the NanoVDB work. I'm happy to see some of that Nvidia GVDB work getting back into the main repository vs trying to consume OpenVDB with their own project. I would agree that sounds indeed interesting - would that contain some of these same issues as we see with OpenVDB / TBB on versioning? I haven't found a good list of "where" the GPU is used most other than raycasting, which we don't use here, but it claims random access is faster, which we use a lot of. So either way, it's a good bet.
>
> I'd love to use NanoVDB as the basis of a new nav2 environmental model, but all these versioning issues are killing us and if this was in the core nav2 stack, that could be critical to release and put us back months on each OS release. The docs say NanoVDB has few external dependencies, is it true that it would solve this issue for us? If it is, and it could stop future versioning issues, it might be a good idea to do it here as a trial and use our experience there to inform our future decisions. Having GPU-enabled raycasting could really help save me a ton of time down the road on a voxel-grid replacement in Nav2.

Well, I believe the versioning is properly done by the OpenVDB team: https://github.com/AcademySoftwareFoundation/openvdb/releases. Then it is just a matter of proper deb packaging. If we fear that the Ubuntu releases are going to be an issue, we can decide on doing our own ROS OpenVDB release, for instance ros-humble-openvdb8.2 and ros-humble-openvdb9.0, plus the corresponding rosdep keys.

And I would love to see the result in nav2!

doisyg commented 2 years ago

> Hello there, just throwing in my 2 cents.
>
> In case you are interested, I managed to build OpenVDB from the source within the project build. The build script is here. Feel free to copy-paste it, just acknowledge the source :) With this "superbuild" approach I managed to also build the application on different Ubuntu, Fedora, CentOS distributions, macOS, and so on.
>
> I quickly tried to build vdbfusion on Ubuntu 22.04 and the OpenVDB build was working and linking correctly to the application. There was a new compiler error (GCC 11 on Ubuntu 22.04) for the TBB build but I managed to disable it from the compiler flags.
>
> I hope this helps, if you need extra support, or if you would like me to help migrate the build system just let me know.
>
> NOTE: The drawback of this approach is that you need to build OpenVDB from source (slow), but using ccache basically turns this into a one-time problem.
>
> Best wishes!

Good to know that it is easy to build and link to from source, at least for trying it out. Can you confirm that you also have an issue building software that depends on OpenVDB on Ubuntu 22.04 with the Ubuntu-provided package?

nachovizzo commented 2 years ago

> > Hello there, just throwing in my 2 cents.
> >
> > In case you are interested, I managed to build OpenVDB from the source within the project build. The build script is here. Feel free to copy-paste it, just acknowledge the source :) With this "superbuild" approach I managed to also build the application on different Ubuntu, Fedora, CentOS distributions, macOS, and so on.
> >
> > I quickly tried to build vdbfusion on Ubuntu 22.04 and the OpenVDB build was working and linking correctly to the application. There was a new compiler error (GCC 11 on Ubuntu 22.04) for the TBB build but I managed to disable it from the compiler flags.
> >
> > I hope this helps, if you need extra support, or if you would like me to help migrate the build system just let me know.
> >
> > NOTE: The drawback of this approach is that you need to build OpenVDB from source (slow), but using ccache basically turns this into a one-time problem.
> >
> > Best wishes!
>
> Good to know that it is easy to build and link to from source, at least for trying it out. Can you confirm that you also have an issue building software that depends on OpenVDB on Ubuntu 22.04 with the Ubuntu-provided package?

Not really, since I'm always building OpenVDB from source... to avoid exactly this sort of problem. I can't rely on Ubuntu and their extremely slow and complicated packaging process (I don't blame them though, it's a big jungle anyway). That's why I've decided to include it in my own build: I can control it and I know it's working (almost) everywhere. As an example, I'm using some Docker images built on top of CentOS (manylinux) where I can't control which OpenVDB version will be packaged for that distro, therefore it made much more sense to just build it :)

If you think it will be valuable for the community I can simply open a PR here with the updated build system (shouldn't take long)

Let me know!

doisyg commented 2 years ago

All right, that leaves us the following options:

A. Wait for a package fix in Ubuntu.

B. Do an OpenVDB ros release.

C. OpenVDB source build as part of the package that needs it (here STVL), i.e. @nachovizzo's solution

I am personally in favor of B.

> If you think it will be valuable for the community I can simply open a PR here with the updated build system (shouldn't take long)

Thanks, I'll take it. It will be helpful for testing STVL with Humble beta, though I am not sure it is ideal in the long term.

SteveMacenski commented 2 years ago

I think B is the most reasonable, but I would give A a few weeks / a month to see if we can't get an updated release out and have that just be part of the package manager without our doing. If we add our own buildfarm setup, we'll have to be owning that process and version controlling for ROS and having to explain to folks why we need it. The issue we bring up is in the same theme as the other openVDB issues we've had in the past that did eventually get resolved.

Want to start that process? It might be easier than we think if we have this versioning information / issues documented like above, seems like a no brainer to get v8.2 out -- maybe in addition to v8.1 if we can't replace

clalancette commented 2 years ago

I'm going to preface this by saying that this, unfortunately, is going to be tricky to fix. (as a side note, this is one of the reasons that we aggressively change Ubuntu distributions to the latest LTS as soon as packages are available; if you find these issues before the release, they are easier to fix in the distribution itself)

> A. Wait for a package fix in Ubuntu.

This is, by far, the best option. If we can convince the Ubuntu maintainers to upgrade, or at least take a patch to fix this particular issue, that will fix it for this package and for the rest of the distribution. It will also make it so that binaries that link both against code here and against anything else that uses OpenVDB will continue to work without ABI issues.

> B. Do an OpenVDB ros release.

This can be done, but it has to be done extremely carefully. In particular, you need to make sure that nothing else in the system will accidentally find this package instead of the system one (as that can lead to the ABI issues I mentioned above). For instance, when we vendor OGRE for Rviz, we install it to /opt/ros/rolling/opt/rviz_ogre_vendor, which is outside of the normal hierarchy, and won't be found by CMake or pkg-config by default. Then you need to install an "extras" file so that that path is exposed to things looking for the vendor package. And finally, you need to be careful to do find_package(rviz_ogre_vendor) as the very first thing in the CMakeLists.txt, to make sure nothing else finds the underlying package first.

That makes it work, but it is still not great. If a third-party package wants to depend both on something that uses libogre-dev (from the system), and something that uses rviz_ogre_vendor, they can't. Actually, it is worse than that; it will all compile just fine, and randomly crash because of ABI issues.

So in short, I'd say this is a last-ditch effort, because it is tricky to set up properly and tricky for downstream users to use properly.

> C. OpenVDB source build as part of the package that needs it (here STVL), i.e. @nachovizzo's solution

If you can't do A, I'll suggest this one. However, you also need to be careful here. The absolute best way to do this is to build OpenVDB as a static library, and then link it into your main executable with private symbols. If you do that, then anything downstream that wants to depend on both your library and something else that uses OpenVDB should work.
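
A minimal CMake sketch of the "static library, private symbols" idea described above, under stated assumptions. It is not taken from STVL: `OPENVDB_STATIC_LIB` and `OPENVDB_INCLUDE_DIR` are hypothetical variables that would point at a static OpenVDB built earlier in the same build, `my_voxel_layer` and its source file are placeholder names, and `--exclude-libs` is a GNU ld / lld option, so other linkers would need a different mechanism.

```cmake
cmake_minimum_required(VERSION 3.13)
project(static_openvdb_sketch LANGUAGES CXX)

# Hypothetical inputs: location of a static OpenVDB produced earlier in this build.
# Both must be set (e.g. with -D on the command line) before this configures.
set(OPENVDB_STATIC_LIB "" CACHE FILEPATH "Path to libopenvdb.a (assumed)")
set(OPENVDB_INCLUDE_DIR "" CACHE PATH "OpenVDB include directory (assumed)")

# Wrap the archive in an imported target so it can be linked like any other library.
add_library(openvdb_static_imported STATIC IMPORTED)
set_target_properties(openvdb_static_imported PROPERTIES
  IMPORTED_LOCATION "${OPENVDB_STATIC_LIB}"
  INTERFACE_INCLUDE_DIRECTORIES "${OPENVDB_INCLUDE_DIR}")

# Placeholder library standing in for the package that needs OpenVDB.
add_library(my_voxel_layer SHARED src/my_voxel_layer.cpp)

# Hide this library's own symbols unless they are explicitly exported.
set_target_properties(my_voxel_layer PROPERTIES
  CXX_VISIBILITY_PRESET hidden
  VISIBILITY_INLINES_HIDDEN ON)

# PRIVATE keeps OpenVDB out of the link interface seen by downstream targets.
target_link_libraries(my_voxel_layer PRIVATE openvdb_static_imported)

# Ask the linker not to re-export symbols pulled in from static archives.
target_link_options(my_voxel_layer PRIVATE "LINKER:--exclude-libs,ALL")
```

A real build would also have to link OpenVDB's own dependencies and compile the archive as position-independent code so that it can be linked into a shared library.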

doisyg commented 2 years ago

Thanks for the clarification Chris! Well then I guess we should push for A.

I investigated a bit more and managed to build STVL (with a few code adaptations, https://github.com/SteveMacenski/spatio_temporal_voxel_layer/pull/233) by uninstalling libopenvdb-dev and building and installing OpenVDB v8.2 from source (I also tried v9.0 successfully) with these instructions: https://github.com/AcademySoftwareFoundation/openvdb#building-openvdb. This tends to confirm my quick diagnosis of an incompatibility between OpenVDB v8.1 and TBB 2021.5 due to deprecated headers (<tbb/task_scheduler_init.h>). I believe the changes from OpenVDB v8.1 to v8.2 were mainly about solving this issue.

The Launchpad bug report is here: https://bugs.launchpad.net/ubuntu/+source/openvdb/+bug/1970108. I guess it will need confirmation from other users before it starts to be considered.
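
One way to check the header-based diagnosis on a given machine is a small configure-time probe. This is only a sketch: the project name and the result variable `HAVE_TBB_TASK_SCHEDULER_INIT` are made up for illustration, and the probe only reports whether the installed TBB still provides the header named above, not whether any particular OpenVDB version builds.

```cmake
cmake_minimum_required(VERSION 3.10)
project(tbb_header_probe LANGUAGES CXX)

include(CheckIncludeFileCXX)

# Try to compile a tiny file that includes the header named in the diagnosis.
check_include_file_cxx("tbb/task_scheduler_init.h" HAVE_TBB_TASK_SCHEDULER_INIT)

if(HAVE_TBB_TASK_SCHEDULER_INIT)
  message(STATUS "tbb/task_scheduler_init.h found: the installed TBB still ships it")
else()
  message(STATUS "tbb/task_scheduler_init.h not found: code including it will not compile here")
endif()
```

Saving this as a CMakeLists.txt in an empty directory and configuring it prints one of the two status lines.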

xouillet commented 2 years ago

While waiting for a proper integration by the Ubuntu packaging team, I've ported the Debian packaging to the v8.2.0 source tag and built .deb packages. They are delivered without any warranty, but we are currently using them without issue on our dev branches.

You can find them here: https://github.com/wyca-robotics/openvdb/releases/tag/v8.2.0-debian

Myzhar commented 2 years ago

@SteveMacenski @clalancette is there an official update on this? We tested STVL on Foxy and it's really promising, so we would like to use it on Humble too. @xouillet's solution is good, but it's not good to suggest users install "unofficial" packages to make it work.

doisyg commented 2 years ago

The resolution is in Ubuntu's hands. You can add pressure by confirming the issue on this ticket: https://bugs.launchpad.net/ubuntu/+source/openvdb/+bug/1970108

tonynajjar commented 2 years ago

> While waiting for a proper integration by the Ubuntu packaging team, I've ported the Debian packaging to the v8.2.0 source tag and built .deb packages. They are delivered without any warranty, but we are currently using them without issue on our dev branches.
>
> You can find them here: https://github.com/wyca-robotics/openvdb/releases/tag/v8.2.0-debian

Thank you @xouillet for the current workaround, I can confirm that STVL builds and works after installing these debians.

doisyg commented 2 years ago

An alternative idea: replacing OpenVDB with Bonxai (https://github.com/facontidavide/Bonxai). But for this we would need a bit of visibility on Bonxai maintenance and maturity. What do you think, @facontidavide?

SteveMacenski commented 2 years ago

https://github.com/facontidavide/Bonxai/issues/1 We've had this discussion - it doesn't seem to be suitable given Davide's interest in it as more an exercise than something to support long-term.

facontidavide commented 2 years ago

That said... Bonxai just works. Even without further development, it might be sufficient for your application.

facontidavide commented 2 years ago

I really think you should try Bonxai. If I remember correctly what you do, you already have everything you need.

SteveMacenski commented 2 years ago

Definitely not a judgement on the value proposition :smile: It's more a matter of risk: if you're not a user of it yourself and not going to maintain it long term, that introduces a bit more there. If you were using it actively in another project (or another major project was using it and could give the thumbs up) or committing to maintaining it for the foreseeable future, that'd be a different story. Since STVL is a relatively "historical" package, I'm a bit risk averse to major backbone changes since I'm not currently in a position to do extensive hardware testing to make sure this still works for the spanning set of user applications. I simply don't have the time or the hardware to make that kind of change comfortably or to contribute to Bonxai should there be issues or future features needed. This is very much a hardware-first driven idea and I don't have robots right now with a satisfactory number of sensors for me to test and feel totally comfortable that I'm not going to break people processing 10+ RGBD sensors (e.g. scales) for long-duration uses (e.g. verify no memory leaks or growth).

OpenVDB works well enough (when you can install the damn thing). It's definitely not ideal, but it's the beast we know after you get the install complete. It's my plan in the next ~2 years to fully revamp costmaps and this work, so in the medium-term future this project may be deprecated by being ingested into the main navigation stack itself in a new form.

With that said, I wouldn't block the use of it if someone opened a PR and was happy with it after some relatively extensive testing with a number of sensors. I don't know how I would deal with it at that time off the top of my head, but I'd find a compromise that works for everyone (maybe a bonxai branch of stvl released under a slightly different name & removing OpenVDB's version after enough users give the +1).

facontidavide commented 2 years ago

Perfectly valid reasoning 😊 The last thing I want is to unintentionally insert a regression in your software 😅

SteveMacenski commented 2 years ago

If I had 2 months I could make this my primary project, 10/10 I would rip out OpenVDB from this + vet Bonxai so we can be forever done with it. However even if I was time rich, I'm still 10+ sensor robot poor. The systems that motivated this work I left when I left Simbe.

But, temporal-ness is a key requirement I'm establishing from the onset when I do my environmental modeling kick. It's unclear if I'll get to that in 2023 or 2024, but it's high up on the list next to localization and my current project in trajectory planning. Never a boring year...

doisyg commented 2 years ago

Thank you both for your comments. I am not using any RGB-D camera at the moment and hence I am only following the maintenance of STVL from afar. But I may have to jump back in early 2023, and if by that time OpenVDB is still not fixed in Ubuntu 22.04, I may just do that: create a Bonxai-based branch of STVL and test its robustness on a continuously running robot. If I have good results I will happily share them and then we can debate if this is a desirable direction. I feel committed to helping maintain this beautiful package, with which I had such good results in the past.

SteveMacenski commented 2 years ago

Agreed, building OpenVDB isn't too bad but yes, it is rather annoying that we cannot release STVL binaries for users to install more naturally. Your release though makes that easier

agoeckner commented 1 year ago

It seems that there hasn't been any progress on this. Anything that I can do to move it along?

SteveMacenski commented 1 year ago

@doisyg I believe was making some headway but I know he's busy right now so it might be some time before he responds

nachovizzo commented 1 year ago

Sorry for adding noise once more, but I still can provide support for managing OpenVDB as a static library. I think it was option C a while ago.

While this will impact build times + binary size... It can help getting rid of annoying packaging problems from Debian/Ubuntu.

Your call @SteveMacenski. Even if you need a draft I'm happy to do it.

Best!

SteveMacenski commented 1 year ago

Can you clarify what you mean / intend to do? A static .so file probably isn't going to be architecture independent, but a submodule to a version of OpenVDB that works could be a good stop gap

clalancette commented 1 year ago

> Can you clarify what you mean / intend to do? A static .so file probably isn't going to be architecture independent, but a submodule to a version of OpenVDB that works could be a good stop gap

Yes, to be clear, my option C above wasn't "commit a binary .so file to the package". It was "set things up so that during compilation of this package, you compile OpenVDB as a .a and link it in".

agoeckner commented 1 year ago

> > Can you clarify what you mean / intend to do? A static .so file probably isn't going to be architecture independent, but a submodule to a version of OpenVDB that works could be a good stop gap
>
> Yes, to be clear, my option C above wasn't "commit a binary .so file to the package". It was "set things up so that during compilation of this package, you compile OpenVDB as a .a and link it in".

That certainly sounds like a good option that would allow for STVL binaries to be created & released. I'm all for it!

nachovizzo commented 1 year ago

> Hello there, just throwing in my 2 cents.
>
> In case you are interested, I managed to build OpenVDB from the source within the project build. The build script is here. Feel free to copy-paste it, just acknowledge the source :) With this "superbuild" approach I managed to also build the application on different Ubuntu, Fedora, CentOS distributions, macOS, and so on.
>
> I quickly tried to build vdbfusion on Ubuntu 22.04 and the OpenVDB build was working and linking correctly to the application. There was a new compiler error (GCC 11 on Ubuntu 22.04) for the TBB build but I managed to disable it from the compiler flags.
>
> I hope this helps, if you need extra support, or if you would like me to help migrate the build system just let me know.
>
> NOTE: The drawback of this approach is that you need to build OpenVDB from source (slow), but using ccache basically turns this into a one-time problem.
>
> Best wishes!

@SteveMacenski as mentioned here the idea is to solve the problem via cmake.

I've done this quite some time ago and it has been working so far, check vdbfusion as an example. Another similar one would be KISS-ICP

SteveMacenski commented 1 year ago

Apologies, I was a bit quick on the draw there and re-read the thread to familiarize myself. I think Option C is the best at this point as well. It puts some additional compiling time on the users, but hopefully it's a 1-and-done kind of thing after it's cached in the workspace.

I'd definitely support adding that into STVL so we can get binaries to turn over as well potentially again. It will have very long build times in the farm, but it should be finally available again.

nachovizzo commented 1 year ago

Great, will prepare a pull request in the following weeks/days

doisyg commented 1 year ago

As it has been more than a year since I reported the bug to Ubuntu and no action was taken to re-release OpenVDB to Jammy, I guess it makes sense to try option C if that's not too much effort, knowing there is always the workaround of compiling from source with the fixed deb build by @xouillet: https://github.com/wyca-robotics/openvdb/releases/tag/v8.2.0-debian

nachovizzo commented 1 year ago

@SteveMacenski before I start this effort. Is there any build farm I should be looking to? What about ROS-distro? I will first make it work on docker containers, but I'd like to complete the full effort, not just half of it ;)

I haven't used this package even once in my life, so any directions on how to make sure I integrate OpenVDB properly are more than appreciated!

agoeckner commented 1 year ago

In my Dockerfile, I'm currently doing this:

```dockerfile
FROM ros:humble

# Build and install patched version of OpenVDB (see https://github.com/SteveMacenski/spatio_temporal_voxel_layer/issues/232).
# -y keeps apt-get non-interactive during the image build; the quoted pattern reaches apt-get unexpanded.
RUN apt-get remove -y 'libopenvdb*'; apt-get update && apt-get install -y libboost-system-dev libboost-iostreams-dev libtbb-dev libblosc-dev; \
    git clone --recurse-submodules --branch v8.2.0-debian https://github.com/wyca-robotics/openvdb.git /opt/openvdb && \
    mkdir /opt/openvdb/build && cd /opt/openvdb/build && \
    cmake .. && \
    make -j1 && make install && \
    cd ..; rm -rf /opt/openvdb/build

# Perform ROS dependency installation for our workspace.
ADD ./src /opt/robot/src
RUN rosdep update && rosdep install --from-paths /opt/robot/src --ignore-src -r -y

# Build workspace.
ENV ROBOT_WS=/opt/robot
RUN cd "$ROBOT_WS" && . /opt/ros/humble/setup.sh && colcon build
```

Obviously, this takes forever.

SteveMacenski commented 1 year ago

Good question - @clalancette I know git submodules don't clone in rosdistro for packaging, but is there another way we can ingest a desire to build OpenVDB next to STVL in the buildfarm to enable binary install? For instance, invoking the download of the source / build via cmake? I know that's probably not ideal, but I think we're currently picking the least sticky option :hankey:

That's basically the same as releasing to rosdistro OpenVDB in terms of compute time used, but at least isolates it if we statically link to STVL. The git submodule or bash script doing what @agoeckner shows in docker would work, but not allow us to release binaries. Which of course is still better than the status quo.

clalancette commented 1 year ago

> Good question - @clalancette I know git submodules don't clone in rosdistro for packaging, but is there another way we can ingest a desire to build OpenVDB next to STVL in the buildfarm to enable binary install? For instance, invoking the download of the source / build via cmake? I know that's probably not ideal, but I think we're currently picking the least sticky option :hankey:

You should be able to use the FetchContent functionality of cmake to do something like this. You can see how we use it a bit in the core in e.g. https://github.com/ros2/rosbag2/blob/rolling/mcap_vendor/CMakeLists.txt
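
A hedged sketch of what the FetchContent route could look like. It is not the linked mcap_vendor file and not what any STVL pull request does. The project name is made up, the tag is the v8.2.0 release mentioned earlier in this thread, and the `OPENVDB_*` cache options are assumptions about what OpenVDB's own CMake build reads, so they should be checked against its build documentation. Whether OpenVDB configures cleanly as a subproject is also untested here.

```cmake
cmake_minimum_required(VERSION 3.14)
project(openvdb_fetch_sketch LANGUAGES CXX)

include(FetchContent)

# Assumed OpenVDB build options: static core only, no command-line tools.
set(OPENVDB_CORE_SHARED OFF CACHE BOOL "" FORCE)
set(OPENVDB_CORE_STATIC ON CACHE BOOL "" FORCE)
set(OPENVDB_BUILD_BINARIES OFF CACHE BOOL "" FORCE)

# Build position-independent code so the static archive can go into a shared library.
set(CMAKE_POSITION_INDEPENDENT_CODE ON)

# Download the pinned release at configure time.
FetchContent_Declare(
  openvdb
  GIT_REPOSITORY https://github.com/AcademySoftwareFoundation/openvdb.git
  GIT_TAG v8.2.0
  GIT_SHALLOW TRUE)

# Populate the source tree and add it to this build via add_subdirectory().
FetchContent_MakeAvailable(openvdb)
```

The consuming target would then link against whichever static target that OpenVDB version defines. The target name is left out here on purpose because it is not confirmed in this thread.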

SteveMacenski commented 1 year ago

@agoeckner that seems like a good option with a reference :smile:

agoeckner commented 1 year ago

> @agoeckner that seems like a good option with a reference 😄

It certainly is enough for me to build for our lab. How would we make this work with the ROS build farm though?

SteveMacenski commented 1 year ago

I believe FetchContent should work in the build farm, it has a network connection

nachovizzo commented 1 year ago

@SteveMacenski @doisyg Sorry for being late to the party.

I've created a repo where I'm testing a fork where I conditionally build OpenVDB from the source. It takes a bit more time, but at least succeeds in building. The repo is here; the fork is there.

If the builds succeed on the CI I will open a PR soon.

agoeckner commented 12 months ago

Hey @nachovizzo, did your build succeed?

edit: Never mind, I see PR #267.

tonynajjar commented 8 months ago

Cool, I was able to build successfully, does this also mean that we can have apt-installable binaries for humble and iron? (sorry if it was mentioned somewhere, didn't see it)

SteveMacenski commented 8 months ago

It does open that door now!

tonynajjar commented 8 months ago

Nice, who's going to do us the honor and save us 5 mins of building time in our CI? 😬 (I'm not sure what needs to be done otherwise would have volunteered)

SteveMacenski commented 8 months ago

Tomorrow's release day :-)

SteveMacenski commented 8 months ago

Actually, I'm blocked by the rolling transient issues right now so I can't run a release. But the Humble sync was today, so it's at least a month before another one anyway, so it's no different whether I do it today or 2 weeks from now. I've rescheduled it for 2 weeks from now to hopefully have all the 24.04 and rosdep stuff ironed out.

agoeckner commented 8 months ago

Where is the next Humble sync date listed?

Timple commented 8 months ago

An Iron release would also be appreciated (just so it's not forgotten 🙂)

agoeckner commented 7 months ago

Anything I can do to help get these packages out the door?

SteveMacenski commented 7 months ago

I'll do it today. Syncs are in a weird state due to Rolling on 24.04 and some issue there, and I just haven't been able to get to everything I would have liked to, with many things generally blocked. I had tried to release it, but I wanted to build locally in a clean container first and it was crashing on me from OpenVDB, so it got shelved since I didn't have time to look into it. I can try today however.

If I run into any major issues again with the build farm on the vendor package, I could use some help from those that want to use it (as I'm not really using this on a day to day at the moment & my main priority is Nav2)