Closed PatJRobinson closed 1 year ago
Happened for me as well.
Seems this was merged after I forked Groot in https://github.com/BehaviorTree/Groot/pull/123, but the PR shows it as a warning, not an error. Maybe something changed recently in the compiler flags, specifically -Wsign-compare?
I just updated my fork to include that update and it seems to build fine. You should rebuild in a way that the cache will clone the Groot repo again (or just do a clean rebuild), and it should work.
Thanks for reporting these issues, BTW!
Actually, that's not it. I think the Groot build is broken right now on ROS 2 Galactic, but it seems to work on Humble. That's all the time I have to look into it for now, unfortunately.
I've been meaning to do this upgrade to Humble anyway, so I will probably do that.
No problem, thanks for putting this out there!
Apologies if this is now off-topic, but I've been following your tutorial for NVIDIA + ROS Noetic (https://roboticseabass.com/2021/04/21/docker-and-ros/) and have managed to build the image.
I copied the Dockerfile:
FROM nvidia/cudagl:11.1.1-base-ubuntu20.04
# Minimal setup
RUN apt-get update \
&& apt-get install -y locales lsb-release
ARG DEBIAN_FRONTEND=noninteractive
RUN dpkg-reconfigure locales
# Install ROS Noetic
RUN sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'
RUN apt-key adv --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654
RUN apt-get update \
&& apt-get install -y --no-install-recommends ros-noetic-desktop-full
RUN apt-get install -y --no-install-recommends python3-rosdep
RUN rosdep init \
&& rosdep fix-permissions \
&& rosdep update
RUN echo "source /opt/ros/noetic/setup.bash" >> ~/.bashrc
Then did:
# Build the Dockerfile
docker build -t nvidia_ros .
# Start a terminal
docker run -it --net=host --gpus all \
--env="NVIDIA_DRIVER_CAPABILITIES=all" \
--env="DISPLAY" \
--env="QT_X11_NO_MITSHM=1" \
--volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
nvidia_ros \
bash
So far so good. But when I run gazebo, I get
libGL error: MESA-LOADER: failed to retrieve device information
If I do
export DISPLAY=:0
xhost +
I then get
Invalid MIT-MAGIC-COOKIE-1 keyxhost: unable to open display ":0"
I am a bit confused about how to fix this issue, or even what I am trying to achieve. I believe I am forwarding a GUI session to the container? If you have encountered this issue, and there's a simple fix, I'd greatly appreciate any insights you might have. I will continue looking online for a solution.
Yeah, this stuff is annoying. Docker is great until you need display. Could you try changing the line in the docker run command to:
--env="DISPLAY=${DISPLAY}" \
Unfortunately that didn't work. I'm not really sure where to go from here; is my desktop environment misconfigured or something, or is it some NVIDIA issue?
When I first open a terminal and echo $DISPLAY, I get :1
So, if I
export DISPLAY=:0
Then run the docker run command as you suggested:
docker run -it --net=host --gpus all \
--env="NVIDIA_DRIVER_CAPABILITIES=all" \
--env="DISPLAY=${DISPLAY}" \
--env="QT_X11_NO_MITSHM=1" \
--volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
nvidia_ros \
bash
And then, inside the container
gazebo
I get
No protocol specified
No protocol specified
No protocol specified
Again, if I try to run the container when display is set to :1, I get the
libGL error: MESA-LOADER: failed to retrieve device information
I believe this is because the actual display has not been forwarded to the container, so this library doesn't know where to send its output?
PS: As a side note, I am able to run rqt when the display is set to :1, although I still get the libGL error.
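Note for readers who hit the same output: `No protocol specified` generally means the X server refused the connection because the client presented no authorization it accepts. The `Invalid MIT-MAGIC-COOKIE-1 key` message after forcing `DISPLAY=:0` suggests the session's cookie belongs to a different display (the host reports `:1`). Below is a minimal sketch of one common workaround, assuming the container runs as root and the host uses cookie-based X authorization; it was not tested on the setup described in this thread. The helper name `x11_docker_cmd` and the in-container path `/tmp/.docker.xauth` are hypothetical, and the function only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# Sketch: print (do not execute) commands that forward the host X display.
# Assumptions, not confirmed in this thread: the container user is root and
# the host X server uses MIT-MAGIC-COOKIE-1 authorization.
# x11_docker_cmd is a hypothetical helper; nvidia_ros is the image name
# used earlier in the thread.

x11_docker_cmd() {
    # Keep the display the host session reports (":1" here); do not force ":0".
    display="${1:-$DISPLAY}"
    if [ -z "$display" ]; then
        echo "DISPLAY is empty; run this from a graphical session" >&2
        return 1
    fi
    # Cookie file used by X clients on the host.
    xauth_file="${XAUTHORITY:-$HOME/.Xauthority}"

    # Unquoted here-document: variables expand, and each "\\" prints as "\".
    cat <<EOF
xhost +si:localuser:root
docker run -it --net=host --gpus all \\
    --env="NVIDIA_DRIVER_CAPABILITIES=all" \\
    --env="DISPLAY=${display}" \\
    --env="QT_X11_NO_MITSHM=1" \\
    --env="XAUTHORITY=/tmp/.docker.xauth" \\
    --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \\
    --volume="${xauth_file}:/tmp/.docker.xauth:ro" \\
    nvidia_ros \\
    bash
EOF
}

# Print the commands for the display the host reported in this thread.
x11_docker_cmd ":1"
```

The `xhost +si:localuser:root` rule and the mounted cookie file are two separate ways of authorizing the client, and either may be enough on its own. The `xhost` rule can be revoked afterwards with `xhost -si:localuser:root`.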
Did you install the NVIDIA container runtime from here? https://github.com/NVIDIA/nvidia-docker
Also, I just got a working version that uses Humble which should finish building fully. Let me know if you're able to try out this PR : https://github.com/sea-bass/turtlebot3_behavior_demos/pull/12
Yes I followed the installation guide linked from that repo.
Actually, I believe the NVIDIA part is working correctly. When I run nvidia-smi, I get:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   33C    P8    11W /  N/A |     10MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
My current needs are actually simpler than this project; all I need to do is run some CUDA kernels inside the container, wrapped in a ROS1 node that publishes point clouds to be subscribed to from the host. To this end, I replaced the base NVIDIA image with:
nvidia/cudagl:11.4.2-devel
When I build the container, I can indeed compile some test CUDA code, run a simple ROS point cloud publisher, and echo the topic on the host.
So it's a success, for now, and all thanks to this project you have kindly shared!
As for the display troubles, I think it's an issue with xhost. I can't remember exactly, but I may have done something in the past to mess up the configuration, following online suggestions without really knowing what I was doing (as always!). There is this suspect line in my bashrc:
if [ "$SUDO_USER" != "" ] && [ "$DISPLAY" != "" ]
then
export XAUTHORITY=$(grep "^${SUDO_USER}:" /etc/passwd | cut -d : -f 6)/.Xauthority
fi
I have now commented it out, but I am still getting the same issues.
I think I will try again on a fresh Ubuntu install when I get the time, as in the future I can see containerised Gazebo builds coming in very handy.
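Note for readers in the same position: before reinstalling, it may help to check the three things an X client needs, namely a `DISPLAY` value, the matching Unix socket (display `:N` maps to `/tmp/.X11-unix/XN`), and a readable cookie file. The sketch below prints the state of each. The function names `x11_socket_path` and `check_x11` are hypothetical, and this is a rough diagnostic under those assumptions rather than a definitive test.

```shell
#!/bin/sh
# Sketch: report the pieces a local X client depends on.
# x11_socket_path and check_x11 are hypothetical helper names.

x11_socket_path() {
    # Map a local display string such as ":1" or ":1.0" to its Unix socket.
    num="${1#*:}"     # strip everything through the colon: "1" or "1.0"
    num="${num%%.*}"  # drop the optional screen suffix:   "1"
    printf '/tmp/.X11-unix/X%s\n' "$num"
}

check_x11() {
    display="${1:-$DISPLAY}"
    if [ -z "$display" ]; then
        echo "DISPLAY: not set"
        return 1
    fi
    echo "DISPLAY: $display"

    # The socket must exist wherever the client runs (host or container).
    sock="$(x11_socket_path "$display")"
    if [ -S "$sock" ]; then
        echo "socket: $sock present"
    else
        echo "socket: $sock missing"
    fi

    # XAUTHORITY overrides the default cookie location when it is set.
    xauth_file="${XAUTHORITY:-$HOME/.Xauthority}"
    if [ -r "$xauth_file" ]; then
        echo "xauthority: $xauth_file readable"
    else
        echo "xauthority: $xauth_file not readable"
    fi
}

# Check the display the host reported in this thread.
check_x11 ":1"
```

Running it both on the host and inside the container and comparing the output may show which piece is missing. An `XAUTHORITY` override, such as the one set by the bashrc snippet above under sudo, would show up in the last line.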
Glad you got the NVIDIA CUDA support squared away, which seems to be what you were looking for.
In updating to Humble, Groot again builds. Also, I got rid of this entire NVIDIA base image and am using the regular ROS Humble image instead, mostly because NVIDIA has not released any cudagl images for Ubuntu 22.04 yet and it wasn't an important part of this demo (sorry).
Closing this out due to the Humble upgrade just being merged.
There appear to be some errors in the source code for the groot package.
I tried:
Stack trace below:
Am I doing anything wrong?