Closed weihunko closed 5 years ago
We have never tried running Simulator in Docker. Do you have a specific need to run it inside Docker? If not, I suggest you don't.
In case you need to run simulator inside docker, you'll need to figure out how to use vulkan from inside of docker. nvidia-docker does not support forwarding Vulkan from host. See https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#is-vulkan-supported
Making this work will require installing the NVIDIA drivers inside the container and allowing full access to the NVIDIA devices on the host (without using nvidia-docker). Here's more information on this: https://stackoverflow.com/a/25367554 We have not tried running this, so I cannot say for sure how to make it work.
Thanks for the quick response! The reason I was trying to launch LGSVL simulator inside docker is that our application also uses containers, and I saw those two projects:
However, I may have some misunderstanding here. Is simulator actually running outside docker in lanefollowing project and autoware demo?
We always run the simulator outside of docker. For the lanefollowing demo, only its python scripts run inside docker. These scripts receive images from the simulator over ROS2, run a DNN for lane recognition, and then send control commands back to the simulator over ROS2.
From reading the script, I assume the ros2_bridge also runs inside docker? and the bridge is able to talk to simulator without changing settings?
Yes, the ros2 bridge runs inside docker, in the same ROS2 environment as the AD stack. The bridge listens for websocket connections on TCP port 9090. The simulator connects to this port to talk to ROS2. All the docker container does is expose port 9090 to the host. This way we are able to talk to any ROS 1 or 2 environment.
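Before starting the simulator, it can help to confirm from the host that the bridge port published by the container is reachable. The following is only a minimal sketch: the helper name `port_is_open` is made up for illustration, and `localhost` assumes the container publishes port 9090 on the same machine that runs the simulator.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; the socket is closed
        # again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9090 is the bridge port mentioned above; "localhost" is an assumption.
    print(port_is_open("localhost", 9090))
```

This only checks that something accepts TCP connections on the port; it does not verify that the listener is actually the bridge.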
Thanks a lot! That should work with our application. I will give it a try.
We actually tried to run the previous version (2019.04) of LG sim in Docker, and it works. But I'm not sure if it still works in the new version with Vulkan.
I confirm that the 2019.05 works in docker. With the latest 2019.09 release, I managed to launch from within a docker container the web interface, access it from the outside, and download the assets. Though, due to #366, I couldn't go any further.
@diegoferigo
I just tried to migrate from 2019.04 to 2019.09 (with Vulkan). There is a project, docker-nvidia-vulkan, which may be worth a try. By default, Vulkan is not supported in Docker (18.09 or 19.03).
@david-gwa Thanks a lot for the link! I originally thought that Vulkan required some work on the nvidia runtime. I will have a look at the Dockerfile, let's see if I can get something running. Unfortunately, as I mentioned before, my GPU is not good enough to run the latest simulator version even on my host :/
Keep us updated on your progress!
Yeah, vulkan support for nvidia-runtime would be ideal solution, less configuration would be necessary.
That github repository is doing same thing as I suggested above - install nvidia driver inside container (make sure it is exactly same version as on host) and then it should work. Docker image would not be very portable across different host machines, but it would work when built locally.
This was the way how you used to get cuda or opengl support in containers before nvidia-runtime was created. And there are no big differences between how vulkan interacts with kernel driver, so it should work same way.
Unless there's something I don't know, why not use the official cuda images as a base and just add the Vulkan SDK? (Disclaimer: I'm not familiar at all with Vulkan.) Together with the new nvidia support integrated in Docker 19.03, the resulting image would have far fewer hardcoded components.
The Vulkan SDK does not matter. That is for development. What you want is the runtime. But the problem is not the runtime. The problem is that nvidia drivers are made in a way that the versions of the user-space libraries you use in your process must match the kernel module versions. This is the same for cuda, opengl, and I assume vulkan.
The question is: how do you make the version of the libraries inside the container match the version of the kernel modules on the host? One way is to install the driver inside the docker image. That works fine, but is not very portable. That's why nvidia created the nvidia-docker runtime, which basically is just a hook at container startup. It simply copies some .so files from your host into the newly created container, thus having exactly the same versions of the libraries. And your nvidia code works.
The problem, no idea why, is that they did not do this for vulkan, only for the OpenGL and cuda libraries. That's why you cannot use vulkan with nvidia-docker.
Docker's --gpus support does a similar thing. I think they standardized the way you do these hooks, so it works not only with nvidia, but also with other things (but I don't know many details about the new docker). There is nothing else docker itself is doing to make the GPU available inside the container. So unless the new docker --gpus support added these hooks for the vulkan libraries, it won't help.
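The version-matching requirement described above can be checked with a small script. This is a rough sketch under stated assumptions: it assumes the kernel module version is read from `/proc/driver/nvidia/version` in the usual "Kernel Module  <version>" form, and that the user-space library carries the driver version as a filename suffix (for example `libcuda.so.430.50`). The function names and the sample version numbers are illustrative only.

```python
import re
from typing import Optional


def kernel_module_version(proc_text: str) -> Optional[str]:
    """Extract the driver version from the text of /proc/driver/nvidia/version."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", proc_text)
    return match.group(1) if match else None


def library_version(filename: str) -> Optional[str]:
    """Extract the version suffix from a name such as libcuda.so.430.50.

    Soname links like libcuda.so.1 carry no driver version and yield None.
    """
    match = re.search(r"\.so\.(\d+(?:\.\d+)+)$", filename)
    return match.group(1) if match else None


def versions_match(proc_text: str, library_filename: str) -> bool:
    """True only when both versions are found and are identical."""
    kernel = kernel_module_version(proc_text)
    user = library_version(library_filename)
    return kernel is not None and kernel == user
```

A typical use would be to read `/proc/driver/nvidia/version` inside the container (it is visible once the device nodes are passed through) and compare it against the library files installed in the image.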
Thanks for the insight @martins-mozeiko, now it's more clear. Maybe either @renaudwastaken or @flx42 could add some more detail from upstream?
We publish some Vulkan images that, in conjunction with the --gpus option, will expose some of the libraries inside your container :)
Oh, this looks pretty simple. All you need to do is put vulkan loader & icd files in correct places. Thanks @RenaudWasTaken, this is very useful!
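To see whether the ICD files ended up in a place the Vulkan loader will look, one can list the ICD manifests and the driver library each one points at. This is a sketch, not an official tool: the two directories are common loader search paths on Linux and are an assumption here, as is the helper name `find_icd_libraries`.

```python
import json
from pathlib import Path
from typing import Dict, Iterable

# Common manifest locations on Linux; assumed, not taken from this thread.
DEFAULT_ICD_DIRS = ("/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d")


def find_icd_libraries(dirs: Iterable[str] = DEFAULT_ICD_DIRS) -> Dict[str, str]:
    """Map each readable ICD manifest path to its "library_path" entry."""
    found: Dict[str, str] = {}
    for directory in dirs:
        base = Path(directory)
        if not base.is_dir():
            continue  # Skip search paths that do not exist in this image.
        for manifest in sorted(base.glob("*.json")):
            try:
                data = json.loads(manifest.read_text())
                found[str(manifest)] = data["ICD"]["library_path"]
            except (OSError, ValueError, KeyError, TypeError):
                # Unreadable, malformed, or non-ICD JSON files are ignored.
                continue
    return found


if __name__ == "__main__":
    for path, library in find_icd_libraries().items():
        print(path, "->", library)
```

An empty result inside the container suggests that no ICD manifest was installed or mounted in these locations.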
@RenaudWasTaken The docker pull command may be wrong:
docker pull nvidia/vulkan => docker pull nvidia/vulkan:1.1.121-cuda-10.1-alpha
There isn't a latest tag yet, this is why you can't pull with an empty tag.
We will likely publish it with the beta release.
Hey, @daohu527
I actually use this gitlab: nvidia/container-images. It looks like the nvidia team maintains a few versions of the vulkan images there.
Maybe you can find one that works on your GPU.
@mark-gerow-lge Hi there. I tried to use the vulkan image provided by RenaudWasTaken. However the Ubuntu desktop manager crashed. Have you or anyone else given it a try?
@left4taco If your whole ubuntu desktop crashed, then please check the logs from X11 and/or the kernel. It sounds like an issue with either the GPU hardware or the nvidia driver.
Btw, we have our official instructions for running Simulator inside Docker in git now: https://github.com/lgsvl/simulator/tree/master/Docker
@martins-mozeiko Thanks. I didn't know that it's already officially supported!
I just gave it a try. It looks like I need to be root to run it. Though the reason is unknown, it's good enough now!
You don't need to be root to use Simulator. It works under any user as long as you have not already run it as a different user, because it creates settings files under the ~/.config folder. So next time it needs to be run as the same user.
Check if you have correctly installed docker for your user to avoid using root: https://docs.docker.com/install/linux/linux-postinstall/
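If the simulator was once started as root, settings files under ~/.config may now belong to root, which would explain why a normal user can no longer run it. A quick way to check ownership is sketched below. The helper name is made up, and the script only inspects ~/.config itself because the exact settings subfolder is not named in this thread.

```python
import os
from pathlib import Path


def owned_by_current_user(path: Path) -> bool:
    """True if path exists and its owner is the user running this script."""
    return path.exists() and path.stat().st_uid == os.getuid()


if __name__ == "__main__":
    config_dir = Path.home() / ".config"
    print(config_dir, "owned by current user:", owned_by_current_user(config_dir))
```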
I have tried to run rviz and glxgears inside the container and they worked fine, but I failed to start the LGSVL simulator inside docker. Here is the log message. The nvidia-smi output is as below.
I can launch LGSVL simulator outside docker on my host machine. Here is the log message for running simulator outside docker.
run nvidia-smi outside docker container