Open dpascualhe opened 1 day ago
Right now set_dri_name.sh does not make much sense, because for it to work properly you need to specify whether you want to launch the Docker container without a GPU, with Intel, or with Nvidia.
Unless an advanced user wants to choose a specific architecture, when the script is launched with no arguments it prioritizes Nvidia, then Intel, then any other GPU; if none is found, the DRI_NAME variable is left empty and no acceleration is performed.
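The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `pick_vendor` is a hypothetical helper that reads lspci-style lines on stdin, and the sample lines in the usage comment are made up from the device names quoted elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the actual set_dri_name.sh.
# Reads lspci-style lines on stdin and prints the selected vendor.
pick_vendor() {
    local preferred="${1:-}"   # optional override, e.g. "amd"
    local devices vendor
    # Keep only lines that look like graphics devices.
    devices="$(grep -iE 'vga|3d|display' || true)"

    if [ -n "$preferred" ]; then
        # An explicit vendor was requested: accept only that one.
        if printf '%s\n' "$devices" | grep -qi "$preferred"; then
            printf '%s\n' "$preferred"
            return 0
        fi
        return 1
    fi

    # Default priority: Nvidia, then Intel.
    for vendor in nvidia intel; do
        if printf '%s\n' "$devices" | grep -qi "$vendor"; then
            printf '%s\n' "$vendor"
            return 0
        fi
    done

    # Any other GPU.
    if [ -n "$devices" ]; then
        printf 'other\n'
        return 0
    fi

    # No GPU at all: the caller would leave DRI_NAME empty.
    return 1
}

# Usage (assumes lspci is installed):
#   vendor="$(lspci | pick_vendor "$1")" || echo "No GPU selected" >&2
```

Keeping the selection in a function that reads stdin makes it possible to try the priority logic on captured lspci output from machines you do not have at hand.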
Also, executing it gives the following error if the Docker container is launched with no GPU or with only Intel:
$ source set_dri_name.sh
bash: lspci: command not found
Warning: nvidia-smi not available or failed, skipping NVIDIA GPU.
Error: No GPU found for the vendor ''.
It's yielding that error because it can't run lspci. I added pciutils as a dependency in the Dockerfiles here in RoboticsAcademy (please let me know if that is the right place). Could you install pciutils and then run the script? It should work then.
In any case, I'll update the script to improve the error handling.
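One way such a check could look is sketched below. It is only an assumption about what the improved error handling might do: `require_cmd` is a hypothetical helper name, not something taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: fail early with a clear message
# when a required command is missing.
require_cmd() {
    local cmd="$1"
    local pkg="${2:-$1}"   # package that provides the command (defaults to its name)
    if ! command -v "$cmd" >/dev/null 2>&1; then
        printf 'Error: %s not found; install the %s package.\n' "$cmd" "$pkg" >&2
        return 1
    fi
}

# Usage inside a script that is meant to be sourced:
#   require_cmd lspci pciutils || return 1
```

Because set_dri_name.sh is run with `source`, the sketch uses `return` rather than `exit`, so a missing dependency does not close the calling shell.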
I'm also not able to run check_gpu.sh:
Running glmark2 on GPU: /dev/dri/card1 (PCI: 00:02.0, Name: Intel Corporation Device a788 (rev 04))
Error: main: Could not initialize canvas
Running glmark2 on GPU: /dev/dri/card0 (PCI: 01:00.0, Name: NVIDIA Corporation Device 28e0 (rev a1))
Error: main: Could not initialize canvas
Which docker command are you using to run the container?
Thanks so much for taking the time to review the PR! :hugs:
Running the command:
sudo apt-get install pciutils
seems to fix the set_dri_name.sh script, and it now works properly.
For the second part, I am launching the Docker container with the develop_academy.sh script from RoboticsAcademy, which uses docker compose with the files found here.
This PR includes scripts related to GPU management in RoboticsBackend. ONLY TESTED ON LINUX.
set_dri_name.sh
This script sets the DRI_NAME environment variable depending on the available GPUs, which could reduce the docker run command complexity by freeing users from having to check which devices are available and manually set their paths (it is meant to be called within entrypoint.sh). The only extra dependency is pciutils.
It yields traces about whether or not a GPU has been selected and its vendor. By default, the script prioritizes Nvidia, then Intel, then anything else. This default behavior can be overridden by passing the preferred vendor as an argument (e.g. source set_dri_name.sh amd).
Real example on a laptop with a dual-GPU configuration:
check_gpu.sh
This script benchmarks all available GPUs using glmark2. First, it installs missing dependencies (pciutils and glmark2), and then proceeds to run said benchmark for each found device. It is helpful in two scenarios:
vglrun is used and will complain if there's anything wrong with any of the available GPUs.
To run this script, which is meant to be a tool for developers, the docker run command must be adapted:
$ xhost +local: && docker run --rm -e DISPLAY=$DISPLAY --gpus all --device /dev/dri --net host -it --entrypoint /bin/bash jderobot/robotics-backend
Once inside the container, you can launch the benchmarking tool by running:
source check_gpu.sh
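Before launching the benchmark, it may help to confirm that the container actually received what the docker run flags above are meant to pass in. The sketch below assumes the benchmark needs a non-empty DISPLAY and at least one card device under /dev/dri; `preflight_check` is a hypothetical name and is not part of check_gpu.sh.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of check_gpu.sh.
# Assumes the benchmark needs DISPLAY and at least one /dev/dri/card* device.
preflight_check() {
    local dri_dir="${1:-/dev/dri}"   # directory holding the DRI device nodes
    local status=0

    if [ -z "${DISPLAY:-}" ]; then
        echo "Warning: DISPLAY is not set (was -e DISPLAY passed to docker run?)" >&2
        status=1
    fi

    if ! ls "$dri_dir"/card* >/dev/null 2>&1; then
        echo "Warning: no card devices in $dri_dir (was --device /dev/dri passed?)" >&2
        status=1
    fi

    return "$status"
}

# Usage:
#   preflight_check && source check_gpu.sh
```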
An example of running such benchmark can be seen in the following video: https://youtu.be/ZBXO3J_wgcg
Full output: