lgsvl / simulator

A ROS/ROS2 Multi-robot Simulator for Autonomous Vehicles

Perception & Traffic Light Modules Turn off Automatically when following the 'End-to-end video tutorial of LGSVL simulator with Apollo driving' Tutorial #1601

Closed schromotion closed 3 years ago

schromotion commented 3 years ago

System information

SVL installed from binary with steps from the installation guide

Steps to reproduce the issue:

ISSUE: I'm trying to follow the tutorial "End-to-end video tutorial of LGSVL simulator with Apollo driving", but I'm hitting an issue when running Apollo connected to the SVL simulator: the "Perception" and "Traffic Light" modules keep switching off automatically.

Following the tutorial's linked markdown file, it begins to get interesting here:

The output of this command gives image ... image

Hypotheses:

  1. GPU is too small 1050 Mobile vs 1080 recommended (could be the issue, not sure how to confirm)
  2. Nvidia drivers in the docker container don't match the drivers installed on the host (however, they are both at 460.80 with Cuda 11.2)
  3. A docker group issue. The tutorial is clear that dev_start.sh should not be run with sudo. Even when running without sudo, the password is still requested and I've followed docker's post install steps as described above. (Not sure how to diagnose further)
  4. Nvidia-docker is not installed in the apollo docker. The tutorial never indicated a step to install this package inside of the docker - but maybe there is some sort of GPU issue inside the container. (Not sure how to diagnose further)
    kellen@in_5_0_dev_docker:/apollo$ nvidia-docker version
    bash: nvidia-docker: command not found

Super excited about this simulator and learning more! Let me know if I made any mistakes along the way or if there is a good path to finish the tutorial from here.

ks

lemketron commented 3 years ago

Please note this video and related markdown document is from July, 2020 and refers to LGSVL Simulator 2020.06 used with a pre-release version of Apollo 6.0.

SVL installed from binary with steps from the installation guide

  • Apollo version (3.5, 5.0, 5.5, 6.0): image

This is good as SVL Simulator 2021.2 (or later) is what you should be using.

However, it looks like you are somehow using Apollo 5.0 from the simulator branch of the lgsvl fork of Apollo 5.0, rather than Apollo (upstream) 6.0 or master.

  • Step: The tutorial indicates Dreamview should be started with ./scripts/bootstrap_lgsvl.sh , but my branch only has bootstrap.sh in the scripts folder so I start Dreamview with this. Dreamview loads and I selected the "Mkz Standard Debug" option from the setup drop down (different from the tutorial's selection). All of the modules outlined in the tutorial turn on except 'Perception' and 'Traffic Lights', which turn off shortly after toggling on.

This is because bootstrap_lgsvl.sh is the lgsvl variant of the bootstrap.sh script in the Apollo upstream repository. In the lgsvl fork of Apollo 5.0 it's simply called bootstrap.sh. The fact that you don't have bootstrap_lgsvl.sh indicates that you are not using upstream Apollo. You can also confirm this by typing git remote -v in your apollo directory to see where it was cloned from (but the simulator branch is a clue that it is most probably the lgsvl fork of Apollo 5.0).
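If it helps, the git remote -v check can be scripted. The following is only a minimal sketch: the function names are made up for illustration, and the URL fragment "lgsvl/apollo" used to recognise the fork is an assumption about how the fork's repository is named, so adjust it to whatever your remote actually shows.

```python
import subprocess


def classify_remotes(remote_output: str) -> dict:
    """Label each remote in `git remote -v` output by where it points.

    The substrings matched below are assumptions for illustration:
    "apolloauto/apollo" for upstream, "lgsvl/apollo" for the lgsvl fork.
    """
    labels = {}
    for line in remote_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, url = parts[0], parts[1].lower()
        if "apolloauto/apollo" in url:
            labels[name] = "upstream"
        elif "lgsvl/apollo" in url:
            labels[name] = "lgsvl fork"
        else:
            labels[name] = "other"
    return labels


def check_apollo_checkout(repo_dir: str = ".") -> dict:
    """Run `git remote -v` in repo_dir and classify the result.

    Raises CalledProcessError if repo_dir is not a git checkout.
    """
    completed = subprocess.run(
        ["git", "remote", "-v"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return classify_remotes(completed.stdout)
```

Calling check_apollo_checkout() from the apollo directory returns one label per remote; anything other than "upstream" for origin suggests the checkout is not upstream Apollo.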

Hypothesis:

  1. GPU is too small 1050 Mobile vs 1080 recommended (could be the issue, not sure how to confirm)

Yes, this GPU is on the low end for running ONLY SVL Simulator with a small and simple map. If you are running any complex environments OR trying to run Apollo on the same machine, the 4GB in the 1050 is just not enough memory. You can see this in nvidia-smi, which will show that the memory is consumed by the simulator and your browser, likely not leaving enough for Apollo's prediction module (even if you're not planning to use Apollo perception).
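To put a number on the GPU memory point, nvidia-smi can be queried in CSV form and the free memory computed from it. This is a rough sketch rather than a definitive tool: the function names are hypothetical, and it assumes nvidia-smi is on the PATH of the machine (or container) where it runs.

```python
import subprocess

# CSV query for per-GPU memory, values in MiB without unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list:
    """Turn the CSV lines from the query above into one dict per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma in the GPU name cannot break parsing.
        name, used, total = (field.strip() for field in line.rsplit(",", 2))
        gpus.append(
            {
                "name": name,
                "used_mib": int(used),
                "total_mib": int(total),
                "free_mib": int(total) - int(used),
            }
        )
    return gpus


def gpu_memory_report() -> list:
    """Run nvidia-smi and return the parsed per-GPU memory figures."""
    completed = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(completed.stdout)
```

Running gpu_memory_report() with the simulator and browser open, before starting Apollo, shows how much memory is left over for Apollo's modules.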

  2. Nvidia drivers in the docker container don't match the drivers installed on the host (however, they are both at 460.80 with Cuda 11.2)
  4. Nvidia-docker is not installed in the apollo docker. The tutorial never indicated a step to install this package inside of the docker - but maybe there is some sort of GPU issue inside the container. (Not sure how to diagnose further)

It's been a while since I did a fresh checkout and build of Apollo but I don't recall any concerns with docker or nvidia-docker inside the apollo docker nor having to install anything there. The main concern is to have the correct software installed on the host.

  3. A docker group issue. The tutorial is clear that dev_start.sh should not be run with sudo. Even when running without sudo, the password is still requested and I've followed docker's post install steps as described above. (Not sure how to diagnose further)

The sudo password will still be requested when running dev_start.sh as a (non-root) user. This is normal and to be expected. You should not need to run dev_start.sh with sudo.
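If you still want to rule out the docker group on the host, a small check like the one below can help. It is only a sketch with hypothetical function names, and it assumes a Linux host where the group is called "docker". Note that a newly added group only applies to sessions started after you log in again.

```python
import grp
import os


def current_group_names() -> list:
    """Names of the groups the current process belongs to."""
    gids = set(os.getgroups())
    gids.add(os.getegid())  # make sure the effective group is included
    names = []
    for gid in sorted(gids):
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid without a name entry (possible inside containers).
            continue
    return names


def in_docker_group(group_names, wanted: str = "docker") -> bool:
    """True if the wanted group appears in the given list of group names."""
    return wanted in group_names
```

in_docker_group(current_group_names()) returning False after following the post-install steps usually just means the session predates the group change.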

Summary:

I think you are trying to run Apollo 5.0, but are also trying to use "modular testing" to avoid needing to run Apollo obstacle perception (and traffic light perception). "Modular testing" requires (upstream) Apollo 6.0 or (better still) Apollo master.

Please clone (and re-build) Apollo master from https://github.com/ApolloAuto/apollo and refer to the following for more information:

https://www.svlsimulator.com/docs/system-under-test/apollo-master-instructions/
https://www.svlsimulator.com/docs/tutorials/modular-testing/

schromotion commented 3 years ago

Super helpful! I've since been able to get the sim running with ground truth 3d and ground truth traffic lights which is a great spot until I get a bigger GPU. Thanks again!

lemketron commented 3 years ago

Super helpful! I've since been able to get the sim running with ground truth 3d and ground truth traffic lights which is a great spot until I get a bigger GPU. Thanks again!

Glad to hear you figured it out. A bigger/newer GPU will undoubtedly help but I'm glad you got it working in the meantime!