mvukov / rules_ros2

Build ROS 2 with Bazel
Apache License 2.0
85 stars 47 forks source link

Make image_transport_plugins cpp lib visible #76

Closed lalten closed 1 year ago

lalten commented 1 year ago

Maybe I don't quite understand how the ros2_plugin rule is supposed to be used?

I was able to get my node running with the changes in this PR, and then adding @ros2_image_common//:cpp_image_transport_plugins as deps entry to my ros2_cpp_binary.

I also tried to add @ros2_image_common//:image_transport_plugins as a data dep, but that did not help and I continued to get the error below.

[ERROR] [1678215815.321719302] [VideoDecoderNode]: Failed to load plugin image_transport/raw_pub, error string: MultiLibraryClassLoader: Could not create object of class type image_transport::RawPublisher as no factory exists for it. Make sure that the library exists and was explicitly loaded through MultiLibraryClassLoader::loadLibrary()

Buildifier also removed some load statements and I simplified the rest, I hope you don't mind.

mvukov commented 1 year ago

If your C++ target just depends on "@ros2_image_common//:image_transport",, you're good to go (see e.g. ros2/test/image_common). Your binary/test need to have ament setup working: ros2_test, ros2_launch and ros2_cpp_test ( with set_up_ament = True,) do the setup. Just running bazel run //path/to:my_cc_target is not going to work.

lalten commented 1 year ago

I see! Well this is a node that is a nodes dependency of a ros2_test that I'm trying to port. How would I do that in this case? It seems being a dependency of a ros2_test isn't enough for my VideoDecoderNode to see its plugins, but it's entirely possible I've missed something.

lalten commented 1 year ago

On further inspection, I can see that the aspect does pick up the plugin from the transitive deps for the test target, and it also does set the AMENT_PREFIX_PATH env var. This var is also propagated and visible to the node that the test spawns. I don't know yet why pluginlib can't find the plugin still.

lalten commented 1 year ago

I digged deeper and it pluginlib does actually find the manifest for the plugin, but then fails to load it in https://github.com/ros/pluginlib/blob/5.1.0/pluginlib/include/pluginlib/class_loader_imp.hpp#L177..L182

[VideoDecoderNode-4] [ERROR] [1678227008.705531399] [pluginlib.ClassLoader]: Checking path path/to/pkg/ros2_test_target_testcase_launch_ament_setup/lib/libimage_transport_plugins.so                                                                                                                                                                                 
[VideoDecoderNode-4] [ERROR] [1678227008.705557859] [pluginlib.ClassLoader]: Library image_transport_plugins found at explicit path path/to/pkg/ros2_test_target_launch_ament_setup/lib/libimage_transport_plugins.so.                                                                                                                                       
[VideoDecoderNode-4] [ERROR] [1678227008.725595444] [pluginlib.ClassLoader]: image_transport/raw_pub maps to real class type image_transport::RawPublisher                             
[VideoDecoderNode-4] [ERROR] [1678227008.725669764] [pluginlib.ClassLoader]: Exception raised by low-level multi-library class loader when attempting to create instance of class image_transport/raw_pub.                                                                                                                                                                    
[VideoDecoderNode-4] [ERROR] [1678227008.725680665] [nodename]: Failed to load plugin image_transport/raw_pub, error string: MultiLibraryClassLoader: Could not create object of class type image_transport::RawPublisher as no factory exists for it. Make sure that the library exists and was explicitly loaded through MultiLibraryClassLoader::loadLibrary()   

(Don't mind the ERRORs, these are DEBUGs but I couldn't figure out how to set the log level, so I patched them to be ERROR level logs instead)

I'm not sure where to continue. Any idea what's breaking here?

mvukov commented 1 year ago

It would be good if you can put together a small example repo that I can try out to reproduce the problem -- would also be good to have a .bazelrc file with options similar to the ones you use.

@ahans told me recently that the plugin system doesn't work with a clang compiler. I tried this out and the problem is legit.

mvukov commented 1 year ago

The factory-related issue you printed out indicates that the linker removed a static object responsible for plugin registration.

lalten commented 1 year ago

I do use clang, so there might be something about that. I will try to come up with a MRE.

I also checked the .so that it finds and it seems to contain the right thing (I don't see a factory method, but the class itself is certainly there) ❯ nm -C bazel-bin/path/to/pkg/ros2_test_target_launch_ament_setup/lib/libimage_transport_plugins.so | less

...
0000000000a282f0 W image_transport::RawPublisher::RawPublisher()
...
ahans commented 1 year ago

@lalten I tried a full build with clang and the plugin-related tests failed with similar error messages to what you posted above. Would probably be worth to create an issue for this. Clang-based build is certainly broken. My impression was that it has to do with symbols being removed or at least not having the correct visibility. Easy short term fix is to use gcc. ;-)

lalten commented 1 year ago

I currently have only a clang build available to me :scream: Getting a GCC toolchain stood up for the entire repo is a little out of scope right now :joy: I'll build the MRE, but how do you feel about landing this PR? I understand its not the right :tm: solution, but it does work in my clang build. Once we figure out what's causing the breakage we can revert this workaround and replace with a proper fix.

lalten commented 1 year ago

OK I have a MRE: bazel test //... in this repo :joy:

Clang is used by default on my machine, but you may need to add to the .bazelrc:

build --repo_env=CC=clang
build --repo_env=CXX=clang++

I'll create an issue for this.

lalten commented 1 year ago

--> https://github.com/mvukov/rules_ros2/issues/78

ahans commented 1 year ago

Depending explicitly on stuff via deps and having Bazel link everything together at build time is IMHO better than relying on some mechanism like ament to find dependencies at runtime. So for rules_ros2 this could actually be the right :tm: solution. But it certainly deviates from what stock ROS does. @mvukov must decide what to do with this PR.

mvukov commented 1 year ago

I'd like we come up with a solution that will unblock you -- it can be a tmp band-aid, why not. Figuring out what clang lld linker does differently might take some time.

I can't reproduce locally the situation that works for you (I used gcc btw): the one where you link against the plugin lib. Can you create a branch or a small repo to demonstrate this solution? Please update .bazelrc as well. Thanks in advance.

lalten commented 1 year ago

Can you create a branch or a small repo to demonstrate this solution? Please update .bazelrc as well. Thanks in advance.

https://github.com/mvukov/rules_ros2/pull/83 Sorry, I was too fast and clicked "create PR", and now it's upstream in this repo :facepalm: Anyways, the individual commits tell the story. I was able to fix the image_transport_plugins_tests for the clang build using the workaround.

LiyouZhou commented 1 year ago

Can confirm #83 also solved my problem here https://github.com/mvukov/rules_ros2/discussions/75, is it because for some reason the plugins are not being automatically built? and they have to be listed explicitly as dependencies?

mvukov commented 1 year ago

is it because for some reason the plugins are not being automatically built?

Nope, all relevant targets are built. The problems occur because of different linkers involved, gcc+gold vs clang+lld. gold-linked binaries work, we need a fix for clang+lld.

mvukov commented 1 year ago

@lalten No worries and thanks for the PR. I'll take a look at it soon.

mvukov commented 1 year ago

Closing in favor of #86.