Open diegoferigo opened 3 years ago
I am giving conda a try. I did the following in a plain ubuntu:focal container:
mesa-libgl-devel-cos7-x86_64 libglu libxrandr-devel-cos7-x86_64 xorg-libxrandr
tensorflow-gpu
After these steps, I hoped that the tensorflow problems related to protobuf would be solved. Sadly, they are not. If tensorflow is imported after scenario, tensorflow will segfault.
The problem is the libtensorflow_framework.so.2 library that can be found in the tensorflow Python package. Loading only that library with ctypes.CDLL("/path/to/libtensorflow_framework.so.2") results in the segfault.
At this point, I tried to avoid altering the dlopen flags when loading scenario.bindings.gazebo (done here) and the import problem seems solved. However, we need this tweak to make singletons work (unless https://github.com/ignitionrobotics/ign-gazebo/issues/248 gets implemented somehow), so this is a solution we cannot use. Even though the dlopen flags are restored after the import, tensorflow doesn't like it.
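For reference, a minimal sketch of what "altering the dlopen flags and restoring them after the import" can look like in Python. The choice of `os.RTLD_GLOBAL` and the helper name `dlopen_flags` are assumptions made for illustration; the thread does not state which flags scenario actually sets.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def dlopen_flags(extra_flags):
    """Temporarily OR extra flags into the interpreter's dlopen flags."""
    original = sys.getdlopenflags()
    sys.setdlopenflags(original | extra_flags)
    try:
        yield
    finally:
        # Restore the previous flags even if the import inside the block fails.
        sys.setdlopenflags(original)


# Assumed usage: RTLD_GLOBAL makes the symbols of the loaded extension
# visible to libraries loaded later. The import is left as a comment
# because it requires scenario to be installed.
with dlopen_flags(os.RTLD_GLOBAL):
    pass  # import scenario.bindings.gazebo
```

Note that restoring the flags only affects libraries loaded afterwards; whatever was loaded inside the block keeps the flags it was opened with.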
I think at this point the best workaround that fixes both conda and non-conda environments is detecting whether tensorflow is installed and, if it is found, opening the problematic shared library, similarly to https://github.com/apache/arrow/pull/2210. This should be done by scenario, and it would result in tensorflow being loaded even if the application does not use it. https://github.com/tensorflow/tensorflow/issues/2903 is another related issue.
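A rough sketch of the proposed workaround, under the assumption that the library sits directly in the top-level directory of the tensorflow package. The helper name `preload_library` is hypothetical and this is not the code that ended up in scenario.

```python
import ctypes
import importlib.util
from pathlib import Path


def preload_library(package, library_name):
    """Load `library_name` from an installed package without importing it.

    Returns the path of the loaded library, or None if either the package
    or the library cannot be found.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / library_name
        if candidate.is_file():
            # Open the shared library before any flag tweak takes place.
            ctypes.CDLL(str(candidate))
            return candidate
    return None


# Assumed call site, to be executed before the dlopen flags are changed:
# preload_library("tensorflow", "libtensorflow_framework.so.2")
```

`importlib.util.find_spec` only locates the package, so tensorflow itself is not imported; only the shared library is opened.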
cc @traversaro
> Install ogre 1.9 from sources installing into the conda environment (the ogre package seems too old in conda)
Strange, conda-forge should contain an ogre version that works fine with ignition-rendering, see https://github.com/conda-forge/staged-recipes/pull/13677.
> Install tensorflow-gpu
How did you install tensorflow-gpu? If via the wheel installed by pip, then I guess the problem with the vendored protobuf is the same one that you have when installing the gym-ignition dependencies via apt. The protobuf problem would be avoided if a tensorflow-gpu build (that used the conda-forge protobuf) were available in conda-forge, but unfortunately tensorflow is, for several reasons, quite complicated to package (see https://github.com/conda-forge/tensorflow-feedstock/pull/110, https://github.com/conda-forge/tensorflow-feedstock/issues/107 and https://github.com/conda-forge/tensorflow-feedstock/issues/10).
> > Install ogre 1.9 from sources installing into the conda environment (the ogre package seems too old in conda)
> Strange, conda-forge should contain an ogre version that works fine with ignition-rendering, see conda-forge/staged-recipes#13677.
That is what I thought. Maybe there was something strange in my setup (even though it is a plain docker image), but I got compilation errors. I didn't save the log, my fault, but I remember some failed nullptr checks on ogre pointers. I didn't investigate much: I thought it was just an outdated version of ogre and moved to compiling version 1.9 from source, after which ignition rendering compiled successfully.
> > Install tensorflow-gpu
> How did you install tensorflow-gpu? If via the wheel installed by pip, then I guess the problem with the vendored protobuf is the same one that you have when installing the gym-ignition dependencies via apt. The protobuf problem would be avoided if a tensorflow-gpu build (that used the conda-forge protobuf) were available in conda-forge, but unfortunately tensorflow is, for several reasons, quite complicated to package (see conda-forge/tensorflow-feedstock#110, conda-forge/tensorflow-feedstock#107 and conda-forge/tensorflow-feedstock#10).
I installed the tensorflow-gpu conda package (not from the conda-forge channel). I followed one of the issues you posted (particularly https://github.com/conda-forge/tensorflow-feedstock/issues/10) where people reported that the package from the default channel was working fine. I tried on a fresh container and indeed I've got tf with GPU support.
Now that you make me think about it, I originally thought that this tensorflow package was built against conda-forge's protobuf, but since it's not coming from the conda-forge channel maybe there's still a mismatch. However, I also tried to dig into the various .so files of the tensorflow package with ldd, but I haven't yet found the one that links against protobuf.
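The ldd search described above could be automated along these lines. This is only a sketch: the helper names are hypothetical, and it assumes `ldd` is available on the PATH. Keep in mind that ldd only reports dynamic dependencies, so a protobuf that is statically linked into a library would not show up at all.

```python
import subprocess
from pathlib import Path


def linked_libraries(ldd_output):
    """Return the library names listed in the text printed by `ldd`."""
    names = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields:
            # Each ldd line starts with the library name (or its full path).
            names.append(fields[0])
    return names


def find_dependents(package_dir, needle):
    """Yield shared objects under package_dir whose ldd output mentions needle."""
    for shared_object in sorted(Path(package_dir).rglob("*.so*")):
        result = subprocess.run(
            ["ldd", str(shared_object)],
            capture_output=True,
            text=True,
            check=False,  # ldd exits non-zero on files that are not dynamic objects
        )
        if any(needle in name for name in linked_libraries(result.stdout)):
            yield shared_object


# Assumed usage, with a placeholder path to the installed tensorflow package:
# for path in find_dependents("/path/to/site-packages/tensorflow", "protobuf"):
#     print(path)
```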
> I installed the tensorflow-gpu conda package (not from the conda-forge channel). I followed one of the issues you posted (particularly conda-forge/tensorflow-feedstock#10) where people reported that the package from the default channel was working fine. I tried on a fresh container and indeed I've got tf with GPU support.
Interesting. I think that the tensorflow from the defaults channel is built from the recipes in https://github.com/AnacondaRecipes/tensorflow_recipes. It is probably built against the defaults channel protobuf, which in turn is built from https://github.com/AnacondaRecipes/protobuf-feedstock. However, to double check, it would be interesting to see which packages are actually installed in your environment with conda list.
I haven't reached any conclusion yet; let's see when I find some more time to continue this experiment. Below is the current list of packages. I have some cos6 and cos7 duplicates, but I guess I'm using the cos7 ones.
In any case, so far this was a kind of successful experiment. Despite the few hacks, I managed to compile:
I also need ray and rllib, but I don't think that will be a problem. In earlier experiments I installed them from conda-forge together with tensorflow from conda's default channel, and they seemed to work fine.
I'm trying to prepare a docker image containing what I've got so far, but now I'm stuck with the following error. Comparing the conda packages inside the docker image with what I posted in the comment above did not show anything relevant.
This is happening with ogre from conda-forge. I tried compiling ogre from source and installing it into the conda environment as I did last time, but I get the same error.
Weird, it seems that commenting out the include at /conda/x86_64-conda-linux-gnu/sysroot/usr/include/GL/glx.h:333 solves the problem. The compilation succeeds. Here is the snippet:
#ifndef GLX_GLXEXT_LEGACY
#include <GL/glxext.h> /* This is line 333 */
#endif /* GLX_GLXEXT_LEGACY */
What I don't yet understand is what could have defined GLX_GLXEXT_LEGACY in my previous setup (where I proceeded step-by-step by hand in a clean docker container) as opposed to my current Dockerfile.
Related discussion in the ignition-rendering recipe PR : https://github.com/conda-forge/staged-recipes/pull/13677#issuecomment-760006547 .
Thanks for the link, the problem is the same. I couldn't tell whether a solution was found.
In the meantime, I paste here the Dockerfile I got. I'll commit it somewhere, not sure yet where.
> Thanks for the link, the problem is the same. I couldn't tell whether a solution was found.
No, the -DGLX_GLXEXT_LEGACY flag was just passed to the compiler: https://github.com/conda-forge/libignition-rendering4-feedstock/blob/6f273d6430be1812e2037bb6e976b36aefd5f9ca/recipe/build.sh#L6 .
I managed to reproduce the ogre compilation error I reported above in https://github.com/robotology/gym-ignition/issues/279#issuecomment-774730540:
This error occurs when ogre==1.10.12 is installed in the conda workspace. Updating ogre to 1.12.10 (the only other available version in conda-forge) does not create problems while building ign-rendering4, but I get the runtime error reported in https://github.com/osrf/gazebo/issues/2700 when opening the gui installed by ign-gui4 (from colcon).
The only alternative I haven't checked yet is compiling and installing ogre 1.9.1 from sources. This should match the ogre1 configuration of the ignition packages provided in OSRF's PPA.
> Updating ogre to 1.12.10 (the only other available version in conda-forge) does not create problems while building ign-rendering4,
OK, this makes sense, as that ogre is the one used in the official ignition-rendering4 conda build.
> but I get the error reported in osrf/gazebo#2700 when opening the gui installed by ign-gui4 (from colcon).
Did you try to add ${CONDA_PREFIX}/Media or ${CONDA_PREFIX}/Media/ShadowVolume to OGRE_RESOURCE_PATH (see https://github.com/ignitionrobotics/ign-rendering/blob/a5f00f91f507cf9cc14134f2fca0f03c4a99a7cc/ogre/src/OgreRenderEngine.cc#L83)? If that fix is the only one necessary, we can probably just add an activation script to the ign-rendering4 feedstock.
> This error occurs when ogre==1.10.12 is installed in the conda workspace.
I checked a few of the compilation errors, and they all seem to be at points where there is an ifdef for compatibility between 1.9 and 1.12, such as https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering4/ogre/src/OgreMaterialSwitcher.cc#L82 . Probably the ifdef logic needs to be changed to correctly account for Ogre 1.10.12.
I guess this discussion could be relevant/interesting for @JShep1 @wolfv @Tobias-Fischer .
> > Updating ogre to 1.12.10 (the only other available version in conda-forge) does not create problems while building ign-rendering4,
> OK, this makes sense, as that ogre is the one used in the official ignition-rendering4 conda build.
Yep. However, as reported above, this ogre version has runtime errors when opening Ignition Gazebo's GUI.
> > but I get the error reported in osrf/gazebo#2700 when opening the gui installed by ign-gui4 (from colcon).
> Did you try to add ${CONDA_PREFIX}/Media or ${CONDA_PREFIX}/Media/ShadowVolume to OGRE_RESOURCE_PATH (see https://github.com/ignitionrobotics/ign-rendering/blob/a5f00f91f507cf9cc14134f2fca0f03c4a99a7cc/ogre/src/OgreRenderEngine.cc#L83)? If that fix is the only one necessary, we can probably just add an activation script to the ign-rendering4 feedstock.
Thanks for the hint, I just tried to compile ign-rendering4 again from colcon against conda-forge's ogre==1.12.10. The folder you mentioned is in /conda/share/OGRE/Media. After exporting the environment variable and running the application that opens the Ignition GUI, I get the following error:
[GUI] [Err] [OgreRenderEngine.cc:465] Unable to load Ogre Plugin[/conda/share/OGRE/Media/RenderSystem_GL]. Rendering will not be possible.Make sure you have installed OGRE properly.
terminate called after throwing an instance of 'Ogre::RuntimeAssertionException'
what(): RuntimeAssertionException: Ogre/ShadowExtrudePointLight not found. Verify that you referenced the 'ShadowVolume' folder in your resources.cfg in initialise at /home/conda/feedstock_root/build_artifacts/ogre_1611842210433/work/OgreMain/src/OgreShadowVolumeExtrudeProgram.cpp (line 70)
Aborted (core dumped)
The first line is Ignition Rendering 4 complaining that the RenderSystem_GL folder does not exist (I guess there is a GL feature in ogre that is not enabled in the package from conda-forge). The second line is an Ogre exception.
> > This error occurs when ogre==1.10.12 is installed in the conda workspace.
> I checked a few of the compilation errors, and they all seem to be at points where there is an ifdef for compatibility between 1.9 and 1.12, such as https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering4/ogre/src/OgreMaterialSwitcher.cc#L82 . Probably the ifdef logic needs to be changed to correctly account for Ogre 1.10.12.
I might try to alter the compiler definition by hand, let's see if I manage to get something out of this.
> I might try to alter the compiler definition by hand, let's see if I manage to get something out of this.
The related CMake code is https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering4/ogre/src/CMakeLists.txt#L18 .
Hi @diegoferigo, could you please report the precise version of ogre that is installed in your conda environment? We fixed the bug about the missing RenderSystem_GL one year ago (see https://github.com/conda-forge/ogre-feedstock/issues/15), so I fear that for some reason an outdated version is installed.
> The first line is Ignition Rendering 4 complaining that the RenderSystem_GL folder does not exist (I guess there is a GL feature in ogre that is not enabled in the package from conda-forge). The second line is an Ogre exception.
No, actually I think I misunderstood the role of OGRE_RESOURCE_PATH. Apparently OGRE_RESOURCE_PATH is used to search for OGRE plugins, and those are in ${CONDA_PREFIX}/lib/OGRE, while you can see in the error message that it is searching for the plugin in the wrong directory. We should check whether there is another environment variable for injecting directories into the ogre resource locator, similar to what GAZEBO_RESOURCE_PATH was in Classic Gazebo (see https://github.com/osrf/gazebo/issues/2700#issuecomment-616319253).
> Hi @diegoferigo, could you please report the precise version of ogre that is installed in your conda environment? We fixed the bug about the missing RenderSystem_GL one year ago (see conda-forge/ogre-feedstock#15), so I fear that for some reason an outdated version is installed.
For this test I was using ogre==1.12.10 but, as @traversaro wrote, the error was caused by a wrong usage of OGRE_RESOURCE_PATH. I see RenderSystem_GL*.so files in /conda/lib/OGRE, therefore the package seems OK.
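The check mentioned above (looking for the RenderSystem_GL plugins under the environment's lib/OGRE directory) can be scripted as a small sketch. The helper name is hypothetical; the pattern is widened to `*.so*` on the assumption that versioned file names may also be present.

```python
import os
from pathlib import Path


def ogre_render_plugins(prefix):
    """Return the RenderSystem_GL* shared objects found under <prefix>/lib/OGRE."""
    plugin_dir = Path(prefix) / "lib" / "OGRE"
    return sorted(plugin_dir.glob("RenderSystem_GL*.so*"))


# CONDA_PREFIX is set by `conda activate`; fall back to /conda, the prefix
# used in this thread, when the variable is not defined.
for plugin in ogre_render_plugins(os.environ.get("CONDA_PREFIX", "/conda")):
    print(plugin)
```

An empty result would suggest that the installed ogre package lacks the GL render system, while a non-empty one points back at the search path configuration.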
> > I might try to alter the compiler definition by hand, let's see if I manage to get something out of this.
> The related CMake code is https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering4/ogre/src/CMakeLists.txt#L18
Interestingly, using ogre==1.10.12 and patching ign-rendering4 as follows, I get a working ogre1-based rendering stack :tada:
sed -i "s|if(OGRE_VERSION VERSION_LESS 1.10.3)|if(OGRE_VERSION VERSION_LESS 1.11.0)|g" src/ign-rendering/ogre/src/CMakeLists.txt
sed -i "s|if(OGRE_VERSION VERSION_LESS 1.10.1)|if(OGRE_VERSION VERSION_LESS 1.11.0)|g" src/ign-rendering/ogre/src/CMakeLists.txt
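To see why raising the thresholds changes the outcome for ogre 1.10.12, here is a small worked comparison. The `version_less` helper is only a rough stand-in for CMake's `VERSION_LESS` (component-wise numeric comparison), written for illustration; it is not part of ign-rendering.

```python
def parse_version(text):
    """Turn '1.10.12' into (1, 10, 12) so components compare numerically."""
    return tuple(int(part) for part in text.split("."))


def version_less(version, threshold):
    """Rough stand-in for CMake's VERSION_LESS on three-component versions."""
    return parse_version(version) < parse_version(threshold)


# Original thresholds (1.10.1, 1.10.3) versus the patched one (1.11.0).
for threshold in ("1.10.1", "1.10.3", "1.11.0"):
    print(threshold, version_less("1.10.12", threshold))
```

With the original thresholds both checks are false for 1.10.12, because 12 is compared numerically against 1 and 3; with 1.11.0 the check becomes true, so 1.10.12 follows the same branch as the older releases. Versions 1.9.1 and 1.12.10 take the same branch before and after the patch.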
It seems that the patch was also applied to the conda-forge ign-rendering recipe: https://github.com/conda-forge/libignition-rendering4-feedstock/pull/8.
Using conda, and particularly the conda-forge channel, could introduce many benefits to our project:
About 1), for the sake of testing, manually compiling ign-gazebo with colcon and using the conda environment for the dependencies should suffice.
On conda-forge there are already many of the ign-gazebo dependencies, but a few of them are still missing. As soon as the entire ignition distribution is included in the channel, we'll be able to bypass the manual installation of ign-gazebo from Open Robotics' PPA.
Edit: some preliminary conda instructions have recently been added upstream: https://github.com/ignitionrobotics/docs/blob/master/citadel/install_windows_src.md. The process is for Windows, but it can also be used on other OSs.