allenai / ai2thor

An open-source platform for Visual AI.
http://ai2thor.allenai.org
Apache License 2.0

Running Ai2thor from Docker Container #931

Open moorthyrahul opened 2 years ago

moorthyrahul commented 2 years ago

Hey,

I'm trying to run Ai2thor from a custom Docker container (NOT via Ai2thor-docker, https://github.com/allenai/ai2thor-docker). I installed Ai2thor using pip (inside the Docker container) and ran the following:

from ai2thor.controller import Controller
controller = Controller(scene="FloorPlan10")

File "", line 1, in
File "/root/.local/lib/python3.7/site-packages/ai2thor/controller.py", line 487, in __init__
    self._build = self.find_build(local_build, commit_id, branch, platform)
File "/root/.local/lib/python3.7/site-packages/ai2thor/controller.py", line 1176, in find_build
    % (commit_id, platform_system(), platforms_message)
ValueError: Invalid commit_id: 2f8dd9f95e4016db60155a0cc18b834a6339c8e1 - no build exists for arch=Linux platforms=Linux64

System config:
Ubuntu version inside Docker: 20.04 LTS
Host Ubuntu version: 18.04 LTS

I tried the same on the host machine and it works fine; the issue only occurs when running from inside Docker.

Could you let me know what's causing the error?

ekolve commented 2 years ago

Can you confirm that you have networking configured? You will receive that error if you can't connect to http://s3-us-west-2.amazonaws.com (a better exception will be added). You can also check if you can download: http://s3-us-west-2.amazonaws.com/ai2-thor-public/builds/thor-Linux64-2f8dd9f95e4016db60155a0cc18b834a6339c8e1.zip You may require a proxy to connect out.
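The connectivity check described above can be sketched in Python and run from inside the container. This is only a minimal sketch: `build_url` and `is_reachable` are hypothetical helper names (they are not part of ai2thor), and the URL pattern is inferred from the single download link quoted above. `urllib` normally picks up the `http_proxy`/`https_proxy` environment variables, so the result should reflect the container's proxy settings.

```python
import urllib.error
import urllib.request

# Base URL taken from the download link quoted in the comment above.
BUILD_BASE_URL = "http://s3-us-west-2.amazonaws.com/ai2-thor-public/builds"


def build_url(commit_id, platform="Linux64"):
    """Return the archive URL for a build (hypothetical helper, pattern assumed)."""
    return f"{BUILD_BASE_URL}/thor-{platform}-{commit_id}.zip"


def is_reachable(url, timeout=10.0):
    """Send a HEAD request; return True only if the server answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False
```

Calling `is_reachable(build_url("2f8dd9f95e4016db60155a0cc18b834a6339c8e1"))` inside the container and getting `False` would suggest a networking or proxy problem rather than a missing build.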

I strongly recommend using https://github.com/allenai/ai2thor-docker as a template to get ai2thor running. The latest revision to the repo uses a version that doesn't require Xorg. Otherwise you will need to configure Xorg within the docker container (or pass in the unix socket for the host's Xorg server).
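If the host's Xorg socket is passed into the container, a quick sanity check is whether the socket that `DISPLAY` points to is actually visible there. The sketch below assumes the conventional location `/tmp/.X11-unix/X<n>` for local X11 sockets; `x11_socket_path` and `display_socket_exists` are hypothetical helper names, and the check says nothing about X authorization (for example `XAUTHORITY`).

```python
import os
import re


def x11_socket_path(display):
    """Map a local DISPLAY value such as ':1002' or ':0.0' to its Unix socket path.

    Returns None for remote displays (e.g. 'host:10.0') or malformed values.
    """
    match = re.fullmatch(r"(?:unix)?:(\d+)(?:\.\d+)?", display)
    if match is None:
        return None
    return f"/tmp/.X11-unix/X{match.group(1)}"


def display_socket_exists(environ=os.environ):
    """Return True if DISPLAY names a local display whose socket file exists."""
    path = x11_socket_path(environ.get("DISPLAY", ""))
    return path is not None and os.path.exists(path)
```

For example, `x11_socket_path(":1002")` gives `/tmp/.X11-unix/X1002`; if `display_socket_exists()` returns `False` inside the container, the socket directory was probably not mounted or `DISPLAY` was not passed through.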

moorthyrahul commented 2 years ago

Thank you for your response. I believe it was a proxy-related issue, and I'm not getting that error anymore. Since I'm running it from a custom Docker container, I enabled X11 forwarding and set up Xorg in the container.

Once I run the following (to start Ai2thor) from inside the container:

Python 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0] :: Anaconda, Inc. on linux Type "help", "copyright", "credits" or "license" for more information.

from ai2thor.controller import Controller
controller = Controller()
thor-Linux64-2f8dd9f95e4016db60155a0cc18b834a6339c8e1.zip: [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100% 749.1 KiB/s] of 481.MB

I don't get a response after the zip download, nor does Unity start up for visualization/rendering. Any idea why this is happening? On the host machine, $DISPLAY is supported.

ekolve commented 2 years ago

You can check the log ~/.config/unity3d/Allen\ Institute\ for\ Artificial\ Intelligence/AI2-THOR/Player.log within the container for errors.
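A small helper for inspecting that log from Python might look like the sketch below. The path is the one quoted above, resolved against the current user's home directory; `tail_log` is a hypothetical name. It returns `None` when the file does not exist, which usually means the Unity player never got far enough to write a log.

```python
from pathlib import Path

# Log location quoted in the comment above, relative to the home directory.
PLAYER_LOG = (
    Path.home()
    / ".config/unity3d/Allen Institute for Artificial Intelligence/AI2-THOR/Player.log"
)


def tail_log(path, max_lines=40):
    """Return the last max_lines lines of the log, or None if the file is missing.

    max_lines is expected to be a positive integer.
    """
    path = Path(path)
    if not path.is_file():
        return None
    # Replace undecodable bytes so a partly binary log still prints.
    lines = path.read_text(errors="replace").splitlines()
    return lines[-max_lines:]
```

Running `tail_log(PLAYER_LOG)` inside the container and printing the result is one way to see the most recent errors.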

moorthyrahul commented 2 years ago

Thank you for the response. I don't see the Player.log file being created when running Ai2thor from the container. In the meantime, I have 2 follow-up questions:

1) Since I'm trying to run this from a custom Docker container, could you suggest how to set up Xorg within the Docker container (or pass in the unix socket for the host's Xorg server)?

2) I also tried setting up Ai2thor-docker (https://github.com/allenai/ai2thor-docker). While building the Docker container (running /scripts/build.sh), in Step 11/13 : RUN NVIDIA_VERSION=$NVIDIA_VERSION /app/install_nvidia.sh, I get the following error:

--2021-11-09 08:49:53-- (try: 2) http://us.download.nvidia.com/XFree86/Linux-x86_64/455.32.00/NVIDIA-Linux-x86_64-455.32.00.run
Connecting to us.download.nvidia.com (us.download.nvidia.com)|192.229.211.70|:80... failed: Connection timed out.
Connecting to us.download.nvidia.com (us.download.nvidia.com)|2606:2800:21f:3aa:dcf:37b:1ed6:1fb|:80... failed: Cannot assign requested address.
Retrying.

It tries to build a Docker container with CUDA version 9.2 and NVIDIA driver 455.32.00, based on my system config.

Could you advise on possible solutions to the above queries? Thanks!

ekolve commented 2 years ago

Could you send me the output of the env command when you run that from the host machine? I believe that error is also a proxy issue and the parameter needs to be passed to docker during the build process.

moorthyrahul commented 2 years ago

This is the output of the env command on the host machine:

CLUTTER_IM_MODULE=xim
CONDA_SHLVL=1
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.zst=01;31:.tzst=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.wim=01;31:.swm=01;31:.dwm=01;31:.esd=01;31:.jpg=01;35:.jpeg=01;35:.mjpg=01;35:.mjpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:
CONDA_EXE=/home/hwhiaiuser/anaconda3/bin/conda
PULSE_CLIENTCONFIG=/home/hwhiaiuser/.nx/nxdevice/D-1002-BA5E27BB1CBC440A36552AB54CF7AFC0/audio/client.conf
NXDIR=/usr/NX
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_MENU_PREFIX=gnome-
PULSE_SERVER=/home/hwhiaiuser/.nx/nxdevice/D-1002-BA5E27BB1CBC440A36552AB54CF7AFC0/audio/native.socket
LANG=en_CA.UTF-8
cntlm_port=3128
NX_CUPS_BIN=/usr/bin
DISPLAY=:1002
NX_CLIENT=/usr/NX/bin/nxclient
GNOME_SHELL_SESSION_MODE=ubuntu
COLORTERM=truecolor
PULSE_CONFIG=/home/hwhiaiuser/.nx/nxdevice/D-1002-BA5E27BB1CBC440A36552AB54CF7AFC0/audio/daemon.conf
USERNAME=hwhiaiuser
CONDA_PREFIX=/home/hwhiaiuser/anaconda3
NX_SESSION_ID=BA5E27BB1CBC440A36552AB54CF7AFC0
_CE_M=
XDG_SESSION_ID=c2
USER=hwhiaiuser
DESKTOP_SESSION=ubuntu-xorg
NX_SYSTEM=/usr/NX
QT4_IM_MODULE=xim
TEXTDOMAINDIR=/usr/share/locale/
GNOME_TERMINAL_SCREEN=/org/gnome/Terminal/screen/0629e175_8ba4_4307_af02_c2ff2054e780
PULSE_SCRIPT=/home/hwhiaiuser/.nx/nxdevice/D-1002-BA5E27BB1CBC440A36552AB54CF7AFC0/audio/default.pa
PWD=/home/hwhiaiuser/ai2thor-docker
HOME=/home/hwhiaiuser
PULSE_RUNTIME_PATH=/run/user/1000/pulse
CONDA_PYTHON_EXE=/home/hwhiaiuser/anaconda3/bin/python
TEXTDOMAIN=im-config
NX_ROOT=/home/hwhiaiuser/.nx
https_proxy=http://127.0.0.1:3128
XDG_DATA_DIRS=/usr/share/ubuntu-xorg:/usr/local/share:/usr/share:/var/lib/snapd/desktop
http_proxy=http://127.0.0.1:3128
_CE_CONDA=
cntlm_proxy=127.0.0.1:3128
GJS_DEBUG_OUTPUT=stderr
no_proxy=127.0.0.,.huawei.com,localhost
CONDA_PROMPT_MODIFIER=(base)
MAIL=/var/mail/hwhiaiuser
TERM=xterm-256color
VTE_VERSION=5202
SHELL=/bin/bash
QT_IM_MODULE=ibus
XMODIFIERS=@im=ibus
IM_CONFIG_PHASE=2
XDG_CURRENT_DESKTOP=ubuntu:GNOME
GNOME_TERMINAL_SERVICE=:1.71
cntlm_host=127.0.0.1
SHLVL=2
LANGUAGE=en_CA:en
GDK_BACKEND=x11
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
LOGNAME=hwhiaiuser
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-xwZDRwPN0j,guid=2bfc947b7420f9f75f0fce046189d088
XDG_RUNTIME_DIR=/run/user/1000
XAUTHORITY=/home/hwhiaiuser/.Xauthority
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu-xorg:/etc/xdg
PATH=/home/hwhiaiuser/anaconda3/bin:/home/hwhiaiuser/anaconda3/condabin:/home/hwhiaiuser/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
CONDA_DEFAULT_ENV=base
NX_TEMP=/tmp
GJS_DEBUG_TOPICS=JS ERROR;JS LOG
ftp_proxy=http://127.0.0.1:3128
SESSION_MANAGER=local/hwhiaiuser-Alienware-Aurora-R7:@/tmp/.ICE-unix/27776,unix/hwhiaiuser-Alienware-Aurora-R7:/tmp/.ICE-unix/27776
NX_CONNECTION=10.209.203.20 51936 10.209.201.29 4000
LESSOPEN=| /usr/bin/lesspipe %s
GTK_IMMODULE=ibus
=/usr/bin/env

The host machine is on a company network. Let me know if any parameters need to be passed while building/running the Docker container.

ekolve commented 2 years ago

Can you try updating the file scripts/build.sh - modify this line:

docker build --build-arg NVIDIA_VERSION=$NVIDIA_VERSION --build-arg CUDA_VERSION=$CUDA_VERSION -t ai2thor-docker:latest .

to this:

docker build --build-arg=http_proxy=$http_proxy --build-arg=https_proxy=$https_proxy --build-arg NVIDIA_VERSION=$NVIDIA_VERSION --build-arg CUDA_VERSION=$CUDA_VERSION -t ai2thor-docker:latest .
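The same idea, forwarding the host's proxy variables as build arguments, can be sketched in Python for anyone who drives the build from a script. `docker_build_command` is a hypothetical helper, not part of ai2thor-docker; it only assembles the argument list and leaves running it (for example with `subprocess.run`) to the caller. Proxy variables that are unset or empty are skipped instead of being passed as empty build args.

```python
import os

# Proxy variables forwarded to the build, matching the modified command above.
PROXY_VARS = ("http_proxy", "https_proxy")


def docker_build_command(nvidia_version, cuda_version, environ=os.environ):
    """Assemble the docker build argv, forwarding any proxy variables that are set."""
    command = ["docker", "build"]
    for name in PROXY_VARS:
        value = environ.get(name)
        if value:
            command += ["--build-arg", f"{name}={value}"]
    command += [
        "--build-arg", f"NVIDIA_VERSION={nvidia_version}",
        "--build-arg", f"CUDA_VERSION={cuda_version}",
        "-t", "ai2thor-docker:latest",
        ".",
    ]
    return command
```

One caveat worth checking: under Docker's default bridge networking, a proxy address on 127.0.0.1 refers to the build container's own loopback interface, so such a value may need to be replaced with an address that is reachable from inside the container.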

moorthyrahul commented 2 years ago

Thanks for your suggestion. I built and ran the Ai2thor-docker container on my machine, but I'm still unable to visualize the output from inside the container. While running example_agent.py (https://github.com/allenai/ai2thor-docker), it's stuck indefinitely after initializing the controller object.

I tried both the CloudRendering and Linux64 platforms while building the container, but there's no change. From my understanding, there's no requirement to set up Xorg or $DISPLAY when using the Ai2thor Docker image.

Could you kindly suggest what can be done?