chuangg / tdw-transport-challenge-starter-code

Problem when running in docker #8

Closed meier-johannes94 closed 3 years ago

meier-johannes94 commented 3 years ago

I tried to run the code in Docker with a slightly modified setup and got the following error:

jmeier@eiturtindur:~/tdw/docker_setup/tdw-transport-challenge-starter-code$ nvidia-docker run  --network none --env="DISPLAY=:1" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" submission_image sh run_hello_world.sh
Set current directory to /
Found path: /TDW/TDW.x86_64
hello world
Traceback (most recent call last):
  File "hello_world.py", line 49, in <module>
    physics=True, port=7845, launch_build=False)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 145, in make
    return registry.make(id, **kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 90, in make
    env = spec.make(**kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 60, in make
    env = cls(**_kwargs)
TypeError: __init__() got an unexpected keyword argument 'launch_build'
ERROR conda.cli.main_run:execute(33): Subprocess for 'conda run ['python', 'hello_world.py']' command failed.  (See above for error)

Then I removed the launch_build argument and got:

Traceback (most recent call last):
  File "hello_world.py", line 49, in <module>
    physics=True, port=7845)  # , launch_build=False)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 145, in make
    return registry.make(id, **kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 90, in make
    env = spec.make(**kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 60, in make
    env = cls(**_kwargs)
  File "/tdw-transport-challenge/tdw_transport_challenge/tdw_gym.py", line 44, in __init__
    train = train, exp = exp, fov = 90)
  File "/tdw-transport-challenge/tdw_transport_challenge/controller.py", line 32, in __init__
    screen_width=screen_size, screen_height=screen_size, fov = fov, check_pypi_version=False)
TypeError: __init__() got an unexpected keyword argument 'check_pypi_version'

So I am worried about whether the image is up to date. It was last updated 5 months ago, and it reads as if it is using tdw 1.8.4: https://hub.docker.com/r/transportchallenge/transport_challenge_2021 Or am I making a mistake here?
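
Both tracebacks above end in a TypeError about an unexpected keyword argument, which usually means the calling code was written against a different version of the class than the one installed. A minimal sketch of how one could check which keywords a constructor accepts before calling it. `OldController` and `NewController` are hypothetical stand-ins for two versions of a class, not the real TDW or Magnebot classes:

```python
import inspect


def unsupported_kwargs(cls, kwargs):
    """Return the keyword names that cls.__init__ would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs catch-all accepts any keyword, so nothing is rejected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


class OldController:
    """Hypothetical older version: no check_pypi_version parameter."""

    def __init__(self, port=1071, launch_build=True):
        self.port = port
        self.launch_build = launch_build


class NewController:
    """Hypothetical newer version that added check_pypi_version."""

    def __init__(self, port=1071, launch_build=True, check_pypi_version=True):
        self.port = port
        self.launch_build = launch_build
        self.check_pypi_version = check_pypi_version


call_kwargs = {"port": 7845, "check_pypi_version": False}
print(unsupported_kwargs(OldController, call_kwargs))
print(unsupported_kwargs(NewController, call_kwargs))
```

A non-empty result for the installed class points at a version mismatch between the caller and the library rather than at a typo in the call.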

alters-mit commented 3 years ago

@meier-johannes94 What version of Magnebot are you using? Please send the output of pip3 show magnebot

@abhi1092 Can you check the image's version of TDW? Should be 1.8.7
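
The same information can also be read from within Python, which helps when pip3 is not on the PATH. A minimal sketch, assuming it is run with the interpreter of the environment in question and that the packages are registered under the distribution names `tdw`, `magnebot`, `ikpy`, and `gym`:

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.7


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["tdw", "magnebot", "ikpy", "gym"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```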

abhi1092 commented 3 years ago

Sorry, I just saw this. I have uploaded the new image with TDW 1.8.7.

meier-johannes94 commented 3 years ago

@abhi1092 Thank you!

@alters-mit I am not an expert with Docker yet, so this took me a while.

Simply running pip3 does not seem to work: docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "pip3": executable file not found in $PATH: unknown.

So I exported the environments via conda instead, but unfortunately I think this is still not what we need:

name: base
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=4.5=1_gnu
  - brotlipy=0.7.0=py39h27cfd23_1003
  - ca-certificates=2021.10.26=h06a4308_2
  - certifi=2021.10.8=py39h06a4308_0
  - cffi=1.14.6=py39h400218f_0
  - charset-normalizer=2.0.4=pyhd3eb1b0_0
  - conda=4.10.3=py39h06a4308_0
  - conda-package-handling=1.7.3=py39h27cfd23_1
  - cryptography=35.0.0=py39hd23ed53_0
  - idna=3.2=pyhd3eb1b0_0
  - ld_impl_linux-64=2.35.1=h7274673_9
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.3.0=h5101ec6_17
  - libgomp=9.3.0=h5101ec6_17
  - libstdcxx-ng=9.3.0=hd4cf53a_17
  - ncurses=6.2=he6710b0_1
  - openssl=1.1.1l=h7f8727e_0
  - pycosat=0.6.3=py39h27cfd23_0
  - pycparser=2.20=py_2
  - pyopenssl=21.0.0=pyhd3eb1b0_1
  - pysocks=1.7.1=py39h06a4308_0
  - python=3.9.5=h12debd9_4
  - readline=8.1=h27cfd23_0
  - requests=2.26.0=pyhd3eb1b0_0
  - ruamel_yaml=0.15.100=py39h27cfd23_0
  - setuptools=58.0.4=py39h06a4308_0
  - six=1.16.0=pyhd3eb1b0_0
  - sqlite=3.36.0=hc218d9a_0
  - tk=8.6.11=h1ccaba5_0
  - tqdm=4.62.3=pyhd3eb1b0_1
  - tzdata=2021e=hda174b7_0
  - urllib3=1.26.7=pyhd3eb1b0_0
  - xz=5.2.5=h7b6447c_0
  - yaml=0.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3

name: transport_challenge_env
channels:
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=4.5=1_gnu
  - blas=1.0=mkl
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2021.10.8=ha878542_0
  - cairo=1.16.0=hf32fb01_1
  - certifi=2021.10.8=py37h89c1867_0
  - ffmpeg=4.0=hcdf2ecd_0
  - fontconfig=2.13.1=h6c09931_0
  - freeglut=3.0.0=hf484d3e_5
  - freetype=2.11.0=h70c0345_0
  - glib=2.69.1=h5202010_0
  - graphite2=1.3.14=h23475e2_0
  - harfbuzz=1.8.8=hffaf4a1_0
  - hdf5=1.10.2=hba1933b_1
  - icu=58.2=he6710b0_3
  - intel-openmp=2021.3.0=h06a4308_3350
  - jasper=2.0.14=hd8c5072_2
  - joblib=1.1.0=pyhd8ed1ab_0
  - jpeg=9d=h7f8727e_0
  - ld_impl_linux-64=2.35.1=h7274673_9
  - libblas=3.9.0=11_linux64_mkl
  - libcblas=3.9.0=11_linux64_mkl
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.3.0=h5101ec6_17
  - libgfortran-ng=7.5.0=ha8ba4b0_17
  - libgfortran4=7.5.0=ha8ba4b0_17
  - libglu=9.0.0=hf484d3e_1
  - libgomp=9.3.0=h5101ec6_17
  - liblapack=3.9.0=11_linux64_mkl
  - libopencv=3.4.2=hb342d67_1
  - libopus=1.3.1=h7b6447c_0
  - libpng=1.6.37=hbc83047_0
  - libstdcxx-ng=9.3.0=hd4cf53a_17
  - libtiff=4.2.0=h85742a9_0
  - libuuid=1.0.3=h7f8727e_2
  - libvpx=1.7.0=h439df22_0
  - libwebp-base=1.2.0=h27cfd23_0
  - libxcb=1.14=h7b6447c_0
  - libxml2=2.9.12=h03d6c58_0
  - lz4-c=1.9.3=h295c915_1
  - mkl=2021.3.0=h06a4308_520
  - mkl-service=2.4.0=py37h7f8727e_0
  - mkl_fft=1.3.1=py37hd3c417c_0
  - mkl_random=1.2.2=py37h51133e4_0
  - ncurses=6.2=he6710b0_1
  - numpy=1.21.2=py37h20f2e39_0
  - numpy-base=1.21.2=py37h79a1101_0
  - opencv=3.4.2=py37h6fd60c2_1
  - openssl=1.1.1l=h7f8727e_0
  - pcre=8.45=h295c915_0
  - pixman=0.40.0=h7f8727e_1
  - py-opencv=3.4.2=py37hb342d67_1
  - python=3.7.11=h12debd9_0
  - python_abi=3.7=2_cp37m
  - readline=8.1=h27cfd23_0
  - scikit-learn=0.24.2=py37h18a542f_0
  - scipy=1.5.3=py37h8911b10_0
  - setuptools=58.0.4=py37h06a4308_0
  - six=1.16.0=pyhd3eb1b0_0
  - sqlite=3.36.0=hc218d9a_0
  - threadpoolctl=3.0.0=pyh8a188c0_0
  - tk=8.6.11=h1ccaba5_0
  - wheel=0.37.0=pyhd3eb1b0_1
  - xz=5.2.5=h7b6447c_0
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.4.9=haebb681_0
  - pip:
    - altgraph==0.17.2
    - boto3==1.19.7
    - botocore==1.22.7
    - charset-normalizer==2.0.7
    - cloudpickle==2.0.0
    - cycler==0.11.0
    - gym==0.21.0
    - idna==3.3
    - ikpy==3.1
    - imageio==2.10.1
    - importlib-metadata==4.8.1
    - jmespath==0.10.0
    - kiwisolver==1.3.2
    - matplotlib==3.4.3
    - mpmath==1.2.1
    - overrides==6.1.0
    - pandas==1.3.4
    - pillow==8.4.0
    - pip==21.3.1
    - psutil==5.8.0
    - py-md-doc==0.2.4
    - pyastar2d==1.0.2
    - pyinstaller==4.6
    - pyinstaller-hooks-contrib==2021.3
    - pymongo==3.12.1
    - pyparsing==3.0.4
    - python-dateutil==2.8.2
    - pytz==2021.3
    - pyzmq==22.3.0
    - requests==2.26.0
    - s3transfer==0.5.0
    - sympy==1.9
    - tdw==1.8.7.0
    - tqdm==4.62.3
    - typing-extensions==3.10.0.2
    - typing-utils==0.1.0
    - urllib3==1.26.7
    - zipp==3.6.0
prefix: /miniconda/envs/transport_challenge_env

After @abhi1092's changes I pulled the image again, ran it again, and now get this error:

jmeier@eiturtindur:~/tdw/docker_setup/tdw-transport-challenge-starter-code$ nvidia-docker run --network none --env="DISPLAY=:4" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --volume="/tmp/output:/results" -e NVIDIA_DRIVER_CAPABILITIES=all -e TRANSPORT_CHALLENGE=file:////model_library -e NUM_EVAL_EPISODES=1 submission_image sh run_baseline_agent.sh 7845
Set current directory to /
Found path: /TDW/TDW.x86_64
Traceback (most recent call last):
  File "agent.py", line 62, in <module>
    main()
  File "agent.py", line 54, in main
    challenge = Challenge(logger, args.port)
  File "/tdw-transport-challenge/tdw_transport_challenge/challenge.py", line 17, in __init__
    port = port)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 235, in make
    return registry.make(id, **kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 129, in make
    env = spec.make(**kwargs)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 89, in make
    cls = load(self.entry_point)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/gym/envs/registration.py", line 27, in load
    mod = importlib.import_module(mod_name)
  File "/miniconda/envs/transport_challenge_env/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tdw-transport-challenge/tdw_transport_challenge/tdw_gym.py", line 6, in <module>
    from tdw_transport_challenge.controller import Basic_controller as Controller
  File "/tdw-transport-challenge/tdw_transport_challenge/controller.py", line 3, in <module>
    from magnebot import Arm
  File "/magnebot/magnebot/__init__.py", line 1, in <module>
    from .magnebot_controller import Magnebot
  File "/magnebot/magnebot/magnebot_controller.py", line 14, in <module>
    from tdw.output_data import OutputData, Version, StaticRobot, SegmentationColors, Bounds, Rigidbodies, LogMessage,\
ImportError: cannot import name 'MagnebotWheels' from 'tdw.output_data' (/miniconda/envs/transport_challenge_env/lib/python3.7/site-packages/tdw/output_data.py)

I saw that you made the changes here: https://github.com/alters-mit/magnebot/commit/42452f29836c405a9120241bb46e4c9c894496f2, so the version in use must be magnebot 1.3.2 and not 1.3.1, which we need.
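
The ImportError above says that the installed `tdw.output_data` module does not define `MagnebotWheels`, so the installed magnebot expects a different tdw than the one present. A minimal sketch of a check for such a mismatch. The helper name `missing_names` is hypothetical, and the runnable example uses the standard-library `json` module as a stand-in so that it works without tdw installed:

```python
import importlib


def missing_names(module_name, names):
    """Return the names that the imported module does not define."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


# Stand-in example: json defines loads but has no MagnebotWheels attribute.
print(missing_names("json", ["loads", "MagnebotWheels"]))

# Inside the container one could instead run, assuming tdw is importable:
# missing_names("tdw.output_data", ["MagnebotWheels"])
```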

I am not sure how to modify the Docker setup so that we use 1.3.1 instead, since your image is used as the base: https://hub.docker.com/r/transportchallenge/transport_challenge_2021 You probably get the same error when running it in Docker? Could you update the Docker image so that it has the correct magnebot version? And could you perhaps run a quick test so that we can be sure everything works now?

alters-mit commented 3 years ago

@meier-johannes94 Downgrade to magnebot 1.1.1, not 1.3.1

@abhi1092 I don't know if the Docker image is installing Magnebot and TDW, but if so, it shouldn't. Installing transport_challenge will automatically install the correct versions of Magnebot and TDW.

meier-johannes94 commented 3 years ago

@alters-mit Oh, you are right: 1.1.1. I will pull the image again and rebuild once @abhi1092 has updated it; then it should be downgraded automatically, correct?

alters-mit commented 3 years ago

@meier-johannes94 Is transport_challenge being installed inside the Docker container? If so, @abhi1092 needs to remove the instructions to install magnebot and tdw.

If you installed transport_challenge outside of the Docker container, do this to fix your setup:

  1. pip3 install tdw==1.8.7
  2. pip3 install magnebot==1.1.1
  3. pip3 install ikpy==3.1
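
After the three installs above, the result can be compared against the pins. A minimal sketch, assuming the installed versions have already been collected into a dictionary. The `normalize` helper is a simplification that only drops trailing `.0` segments, so that an installed `1.8.7.0` counts as matching the pin `1.8.7`; it is not a full version parser:

```python
# Pins taken from the three install commands above.
PINS = {"tdw": "1.8.7", "magnebot": "1.1.1", "ikpy": "3.1"}


def normalize(version):
    """Drop trailing '.0' segments so '1.8.7.0' and '1.8.7' compare equal."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return ".".join(parts)


def find_mismatches(installed, pins=PINS):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found is None or normalize(found) != normalize(wanted):
            mismatches[name] = (wanted, found)
    return mismatches


# Illustrative input only: tdw as listed in the conda export, magnebot at
# the version suspected earlier in this thread.
example = {"tdw": "1.8.7.0", "magnebot": "1.3.2", "ikpy": "3.1"}
print(find_mismatches(example))
```

An empty dictionary means all three pins are satisfied.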

meier-johannes94 commented 3 years ago

@alters-mit Outside the docker container it works fine :-) The problem occurs only when I use it with Docker.

abhi1092 commented 3 years ago

Yes, I checked the Dockerfile and it does install Magnebot and TDW separately. Let me change that and rebuild the image.

abhi1092 commented 3 years ago

I have pushed the new image. The magnebot version should now be the one transport_challenge uses.

meier-johannes94 commented 3 years ago

@abhi1092, @alters-mit Thank you very much, this works now! However, the performance might not be ideal. Could you check that? I have opened a new ticket: https://github.com/chuangg/tdw-transport-challenge-starter-code/issues/9