erdos-project / pylot

Modular autonomous driving platform running on the CARLA simulator and real-world vehicles.
https://pylot.readthedocs.io/
Apache License 2.0

Error running in Docker mode #170

Open chrisgleeson988 opened 3 years ago

chrisgleeson988 commented 3 years ago

I tried the Docker demo over ssh on a cloud server (Ubuntu 18.04 server, 64G, 8 Titan Xp, CUDA 11.0, driver 450). I want to see the visualization, so I chose the X forwarding method.

The steps were

terminal 1:

docker pull erdosproject/pylot
nvidia-docker run -itd --name pylot -p 20022:22 erdosproject/pylot /bin/bash
nvidia-docker exec -i -t pylot /home/erdos/workspace/pylot/scripts/run_simulator.sh

the log is

4.24.3-0+++UE4+Release-4.24 518 0
Disabling core dumps.
sh: 1: xdg-user-dir: not found
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM default

terminal 2:

nvidia-docker cp ~/.ssh/id_rsa.pub pylot:/home/erdos/.ssh/authorized_keys
nvidia-docker exec -i -t pylot sudo chown erdos /home/erdos/.ssh/authorized_keys
nvidia-docker exec -i -t pylot sudo service ssh start
ssh -p 20022 -X erdos@localhost
cd /home/erdos/workspace/pylot/
python3 pylot.py --flagfile=configs/detection.conf --visualize_detected_obstacles

the log is

erdos@2da4247771a2:~$ cd /home/erdos/workspace/pylot/
erdos@2da4247771a2:~/workspace/pylot$ python3 pylot.py --flagfile=configs/detection.conf --visualize_detected_obstacles
I0312 06:01:58.631235 139723968956224 __init__.py:409] $HOME=/home/erdos
I0312 06:01:58.631560 139723968956224 __init__.py:409] matplotlib data path /home/erdos/.local/lib/python3.6/site-packages/matplotlib/mpl-data
I0312 06:01:58.635867 139723968956224 __init__.py:1156] loaded rc file /home/erdos/.local/lib/python3.6/site-packages/matplotlib/mpl-data/matplotlibrc
I0312 06:01:58.637890 139723968956224 __init__.py:1879] matplotlib version 2.2.4
I0312 06:01:58.637976 139723968956224 __init__.py:1880] interactive is False
I0312 06:01:58.638494 139723968956224 __init__.py:1881] platform is linux
I0312 06:01:58.638668 139723968956224 __init__.py:1882] loaded modules: ['builtins', 'sys', '_frozen_importlib', '_imp', '_warnings', '_thread', '_weakref', '_frozen_importlib_external', '_io', 'marshal', 'posix', 'zipimport', 'encodings', 'codecs', '_codecs', 'encodings.aliases', 'encodings.utf_8', '_signal', '__main__', 'encodings.latin_1', 'io', 'abc', '_weakrefset', 'site', 'os', 'errno', 'stat', '_stat', 'posixpath', 'genericpath', 'os.path', '_collections_abc', '_sitebuiltins', 'sysconfig', '_sysconfigdata_m_linux_x86_64-linux-gnu', '_bootlocale', '_locale', 'types', 'funct ....

pygame 2.0.0 (SDL 2.0.12, python 3.6.9)
Hello from the pygame community. https://www.pygame.org/contribute.html
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM default
Traceback (most recent call last):
  File "pylot.py", line 272, in main
    node_handle, control_display_stream = driver()
  File "pylot.py", line 229, in driver
    prediction_stream, waypoints_stream, control_stream)
  File "/home/erdos/workspace/pylot/pylot/operator_creator.py", line 818, in add_visualizer
    pygame.HWSURFACE | pygame.DOUBLEBUF)
pygame.error: No available video device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pylot.py", line 285, in <module>
    app.run(main)
  File "/home/erdos/.local/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/erdos/.local/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "pylot.py", line 280, in main
    shutdown_pylot(node_handle, client, world)
UnboundLocalError: local variable 'node_handle' referenced before assignment

ICGog commented 3 years ago

Please try the following:

  1. Downgrade pygame from 2.0.0 to 1.9.6 (see https://github.com/erdos-project/pylot/issues/152).
  2. The error line "ALSA lib confmisc.c:767:(parse_card) cannot find card '0'" indicates that your X redirect doesn't work. Please test whether your X redirect works correctly by running a UI application from within the container.

anupamsobti commented 3 years ago

I would like to add here that I got the same error ALSA cannot find card '0' but it wasn't related to the X redirect. My simulation ran fine with the error present. Another reason it wasn't working before was that I started the container using docker start pylot instead of nvidia-docker start pylot. Hope that helps!

chrisgleeson988 commented 3 years ago

Thanks. I guess X forwarding is a separate issue that I can address later. For now I want to check whether the pylot environment works without a GUI. I pulled the latest erdos/pylot and did a manual installation on this Ubuntu 18.04 server (Python 3.7.5, pygame 1.9.6, CUDA 11.0.2, Nvidia driver 450, 8x Nvidia Titan Xp). The nvidia-smi command reports the video cards correctly. Then I ran the commands below to test the demo case:

export CARLA_HOME=$PYLOT_HOME/dependencies/CARLA_0.9.10.1/
cd $PYLOT_HOME/scripts/
source ./set_pythonpath.sh

python3 pylot.py --flagfile=configs/demo.conf

I got the following errors:

Traceback (most recent call last):
  File "pylot.py", line 286, in main
    node_handle, control_display_stream = driver()
  File "pylot.py", line 243, in driver
    prediction_stream, waypoints_stream, control_stream)
  File "/home/demo/github/pylot/pylot/operator_creator.py", line 799, in add_visualizer
    pygame.HWSURFACE | pygame.DOUBLEBUF)
pygame.error: No available video device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pylot.py", line 299, in <module>
    app.run(main)
  File "/home/demo/.local/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/demo/.local/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "pylot.py", line 294, in main
    shutdown_pylot(node_handle, client, world)
UnboundLocalError: local variable 'node_handle' referenced before assignment
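
As far as I can tell from the two tracebacks, the UnboundLocalError is only a side effect: driver() raised before node_handle was assigned, and the cleanup path then referenced it. Below is a minimal sketch of that pattern and one way to guard against it. All names here (driver, shutdown, main_buggy, main_guarded) are hypothetical stand-ins, not pylot's actual code, and I am assuming the cleanup call sits in an except block.

```python
def driver():
    # Stand-in for the real driver(); fails the way pygame did above.
    raise RuntimeError("No available video device")


def shutdown(node_handle):
    # Stand-in for the cleanup call; tolerates a handle that was never created.
    return "shutdown skipped" if node_handle is None else "shutdown done"


def main_buggy():
    try:
        node_handle = driver()   # raises before the name is ever bound
    except RuntimeError:
        shutdown(node_handle)    # UnboundLocalError masks the real error
        raise


def main_guarded():
    node_handle = None           # bind the name before entering the try block
    try:
        node_handle = driver()
    except RuntimeError as err:
        # Cleanup now works, and the original error is still available.
        return shutdown(node_handle), str(err)
```

With the guard in place the first error (the missing video device) is the one that gets reported, which is the one that actually needs fixing.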

ICGog commented 3 years ago

The error you're getting is because pygame is not able to open a visible display. You can either fix the X redirect issue or use pygame's dummy video driver (http://www.pygame.org/wiki/DummyVideoDriver). Alternatively, you can try commenting out the --visualize_* flags in the config file you're using.
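
A rough sketch of the dummy video driver approach, in case it helps. It relies on SDL's SDL_VIDEODRIVER environment variable, which has to be set before pygame initializes its display. The helper names use_dummy_video_driver and open_test_surface are made up for this example, and only falling back when DISPLAY is unset is my own assumption, not something pylot does.

```python
import os


def use_dummy_video_driver(env=os.environ):
    """Select SDL's 'dummy' video driver when no X display is configured.

    Returns the resulting SDL_VIDEODRIVER value, or None if it is left unset.
    """
    if not env.get("DISPLAY"):
        env["SDL_VIDEODRIVER"] = "dummy"
    return env.get("SDL_VIDEODRIVER")


def open_test_surface(size=(640, 480)):
    """Open and close a pygame display; returns the SDL driver in use."""
    import pygame  # imported late so the environment variable is already set

    pygame.display.init()
    pygame.display.set_mode(size)
    driver_name = pygame.display.get_driver()
    pygame.display.quit()
    return driver_name
```

Calling use_dummy_video_driver() and then open_test_surface() in a session without X should succeed instead of raising "No available video device". Nothing is rendered on screen in that mode, so it only tells you whether the rest of the pipeline runs.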

chrisgleeson988 commented 3 years ago

Thanks. For the manual installation on the remote Ubuntu server, the X redirect worked after I set X11Forwarding yes in /etc/ssh/sshd_config and restarted the sshd service on the server. From an Ubuntu 20.04 desktop client I then ran python3 pylot.py --flagfile=configs/demo.conf and the simulation showed up.
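
For reference, here is a small sketch for checking that setting from a script. It is a simplification: it only looks at the global section of the file, ignores Include directives and the "keyword=value" form, and treats a missing entry as "no" (the OpenSSH default). The function name x11_forwarding_enabled is my own.

```python
def x11_forwarding_enabled(config_text):
    """Return True if the global section of an sshd_config enables X11Forwarding."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and commented-out entries
        parts = line.split(None, 1)
        keyword = parts[0].lower()        # sshd keywords are case-insensitive
        if keyword == "match":
            break                         # conditional blocks are out of scope here
        if keyword == "x11forwarding" and len(parts) == 2:
            return parts[1].strip() == "yes"   # the first occurrence wins in sshd
    return False                          # not set: OpenSSH defaults to "no"


if __name__ == "__main__":
    try:
        with open("/etc/ssh/sshd_config") as config_file:
            print("X11Forwarding enabled:", x11_forwarding_enabled(config_file.read()))
    except OSError as err:
        print("Could not read sshd_config:", err)
```

A commented-out line such as "#X11Forwarding yes" counts as unset here, which is easy to overlook when reading the file by eye.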

Then I tried the Docker version with visualization. After running run_simulator.sh, I got this error:

X Error of failed request: BadValue (integer parameter out of range for operation)
  Major opcode of failed request: 151 (GLX)
  Minor opcode of failed request: 3 (X_GLXCreateContext)
  Value in failed request: 0x0
  Serial number of failed request: 101
  Current serial number in output stream: 102
terminating with uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument

Are the steps below correct?

//terminal 1 in remote server
docker pull erdosproject/pylot
nvidia-docker run -itd --name pylot -p 20022:22 erdosproject/pylot /bin/bash
nvidia-docker exec -i -t pylot /home/erdos/workspace/pylot/scripts/run_simulator.sh

//terminal 2 in remote server
nvidia-docker cp ~/.ssh/id_rsa.pub pylot:/home/erdos/.ssh/authorized_keys
nvidia-docker exec -i -t pylot sudo chown erdos /home/erdos/.ssh/authorized_keys
nvidia-docker exec -i -t pylot sudo service ssh start
ssh -p 20022 -X erdos@localhost
cd /home/erdos/workspace/pylot/
python3 pylot.py --flagfile=configs/detection.conf --visualize_detected_obstacles

Also, after I ssh into the pylot Docker container, I noticed that X11Forwarding in /etc/ssh/sshd_config is left at its default value (not yes). Should it be set to yes?