qcr / benchbot

BenchBot is a tool for seamlessly testing & evaluating semantic scene understanding tools in both realistic 3D simulation & on real robots
BSD 3-Clause "New" or "Revised" License

Can the benchbot run without the render window open? #7

Closed Hi-Zed closed 3 years ago

Hi-Zed commented 4 years ago

Hello, I would like to use BenchBot, but I currently don't have access to a system that meets the hardware requirements. Is it possible to run BenchBot on a headless server with no graphical interface and connect to the simulation remotely?

Thanks

btalb commented 4 years ago

Hey @Hi-Zed; that's a really good question & a use case we would like to have a robust answer for.

We've done some preliminary digging & reaffirmed that things will only work if the remote hardware is forced to perform the simulation rendering. In essence, that rules out connecting to a remote machine and accessing the simulator via X window forwarding (i.e. ssh -X user@remote-machine), because with window forwarding the rendering is done on the local machine.
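As a rough way to tell which situation a shell session is in, the sketch below classifies the DISPLAY variable. It assumes the common OpenSSH default where X forwarding sets DISPLAY to a proxy such as localhost:10.0, while a local X server uses something like :0. The function name classify_display is hypothetical, and other setups may use different values.

```shell
#!/bin/sh
# Classify an X11 DISPLAY string (heuristic sketch, not part of BenchBot).
# Assumption: ssh -X typically sets DISPLAY to "localhost:10.0" or similar,
# whereas an X server running on this machine usually appears as ":0" or ":1".
classify_display() {
    case "$1" in
        "")                      echo "unset" ;;      # no X display configured
        localhost:*|127.0.0.1:*) echo "forwarded" ;;  # likely ssh X forwarding
        :*)                      echo "local" ;;      # X server on this machine
        *)                       echo "remote" ;;     # some other host's X server
    esac
}

# Report on the current session.
classify_display "${DISPLAY:-}"
```

If this prints "forwarded", windows are being drawn by the machine you connected from, which is the case described above as not working.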

By far the easiest solution we've found is for the remote machine not to be truly headless (i.e. it has an X server set up with hardware rendering available), and to access that remote desktop via some sort of VNC-style protocol.

We tried this on our headless machine and got most of the way, but realised we hadn't correctly configured the fake X server to perform hardware rendering: it was trying to run Vulkan commands with software rendering, which produced errors.
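One way to spot the software-rendering situation described above is to look at the renderer name the X server reports. The sketch below is a heuristic only: it assumes software rasterisers identify themselves with names such as llvmpipe, softpipe, swrast or lavapipe, and that glxinfo (usually packaged in mesa-utils) is installed on the remote machine. The function name is_software_renderer is hypothetical.

```shell
#!/bin/sh
# Return success (0) if a renderer name looks like a software rasteriser.
# Heuristic: matches names commonly used by Mesa's CPU-based renderers.
is_software_renderer() {
    # Lowercase the name so the match is case-insensitive.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *llvmpipe*|*softpipe*|*swrast*|*lavapipe*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the X server if glxinfo is available; otherwise do nothing.
if command -v glxinfo >/dev/null 2>&1; then
    renderer=$(glxinfo -B 2>/dev/null | sed -n 's/^OpenGL renderer string: //p' || true)
    if [ -z "$renderer" ]; then
        echo "Could not query a renderer (is DISPLAY set?)"
    elif is_software_renderer "$renderer"; then
        echo "Software rendering detected: $renderer"
    else
        echo "Renderer: $renderer"
    fi
fi
```

This checks the OpenGL renderer rather than Vulkan, so treat it as a quick sanity check of the X server setup and not as a replacement for the Vulkan checks in the list below.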

A good process for testing these things out & isolating where issues are coming from is to:

  1. Install BenchBot as normal (the installer is entirely terminal based & will work headless).

  2. Access the machine through your remote access method.

  3. Check the GPU is accessible through your remote access method:

    u@pc:~$ docker run --rm --gpus all -it benchbot/simulator:base nvidia-smi
    Fri Jul  3 13:54:54 2020
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 1080    On   | 00000000:01:00.0  On |                  N/A |
    | 27%   27C    P8     6W / 180W |    197MiB /  8111MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    +-----------------------------------------------------------------------------+

    The command should print the usual nvidia-smi output, similar to the above.

  4. Confirm you can view windows on your remote machine that are generated by the Docker container:

    u@pc:~$ xhost +local:root
    non-network local connections being added to access control list
    u@pc:~$ docker run --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY -it benchbot/simulator:base /bin/bash -c 'sudo apt install x11-apps && xeyes'
    ...

    A set of eyes should come up on the screen if windows are being displayed correctly.

  5. Confirm Vulkan can successfully perform rendering:

    u@pc:~$ docker run --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY -it benchbot/simulator:base vulkaninfo

    A lot of text should print out with no errors. Then, do a final rendering check which should display a spinning cube:

    u@pc:~$ docker run --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY -it benchbot/simulator:base vkcube
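The checks above can be chained so that the first failure is easy to spot. The following is a minimal sketch, not an official BenchBot script: the helper names run_check and benchbot_remote_checks are hypothetical, the docker commands are copied from the steps above, and it assumes a non-zero exit status indicates a failed check (the printed output is still worth reading). The xeyes step is left out because it installs a package interactively.

```shell
#!/bin/sh
IMAGE=benchbot/simulator:base
X11_SOCKET=/tmp/.X11-unix

# Run one named check, print PASS or FAIL, and return the command's status.
run_check() {
    label=$1
    shift
    if "$@"; then
        echo "PASS: $label"
    else
        status=$?
        echo "FAIL: $label (exit status $status)"
        return "$status"
    fi
}

# Run the checks in order, stopping at the first failure.
# Must be called from an interactive terminal because the commands use -it.
benchbot_remote_checks() {
    run_check "GPU is visible inside the container" \
        docker run --rm --gpus all -it "$IMAGE" nvidia-smi || return
    run_check "local X connections are allowed" \
        xhost +local:root || return
    run_check "vulkaninfo runs without errors" \
        docker run --gpus all -v "$X11_SOCKET:$X11_SOCKET" -e DISPLAY \
            -it "$IMAGE" vulkaninfo || return
    run_check "vkcube renders (close the window to finish)" \
        docker run --gpus all -v "$X11_SOCKET:$X11_SOCKET" -e DISPLAY \
            -it "$IMAGE" vkcube || return
}
```

After sourcing the file on the remote machine, calling benchbot_remote_checks from within the remote desktop session runs each step in turn; the first FAIL line shows which part of the setup needs attention.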

Sorry, that's a lengthy list. Unfortunately, getting things working on a remote machine makes it difficult to ensure that the correct part of the distributed system is performing the rendering.

Please let us know how you go with this / if you have different experiences, as we realise the hardware requirements are quite high & would like to provide any methods we can to make using BenchBot more feasible.

david2611 commented 3 years ago

We have just added a new carter_remote robot to the available robots, which should allow headless remote access. Note that it uses OpenGL rather than Vulkan as the renderer, and we aren't 100% sure how this may affect operation. More info is available in the Wiki FAQ.

Hope this helps you out 🙂