Avalon-Benchmark / avalon

A 3D video game environment and benchmark designed from scratch for reinforcement learning research
https://generallyintelligent.com/avalon/
GNU General Public License v3.0

[Error] Inconsistent coordinate dimensionality #20

Closed emigmo closed 1 year ago

emigmo commented 1 year ago

When running the test command python -m avalon.agent.train_ppo_avalon in a Docker container, I get the following error:

  ValueError: Inconsistent coordinate dimensionality
  Caught error in _on_world_generation_error. This shouldn't happen!!!:
  Inconsistent coordinate dimensionality
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 0... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 1... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 2... (reason: Inconsistent coordinate dimensionality)
  Unspecified error in world generation, this was try 3... (reason: Inconsistent coordinate dimensionality)
  Ran out of retries to generate a good world!
  Params were: GenerateAvalonWorldParams(task=<AvalonTask.OPEN: 'OPEN'>, difficulty=0.0, seed=8, index=32, output='/tmp/science/data/level_gen/6b796894-c198-41ba-a553-dbca7358de88/32', num_retries=5)
  An error has been caught in function 'worker', process 'ForkPoolWorker-11:1' (2815), thread 'MainThread' (140485987227456):

emigmo commented 1 year ago

The training process keeps running despite the error, as the log file shows [wandb].

bai-generally-intelligent commented 1 year ago

Interesting, we haven't seen this error before and I cannot reproduce it on my system. Can you run the following on your host system to provide some more info?

emigmo commented 1 year ago

OS version:

Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Nov 8 23:39:32 UTC 2018

GPU and driver info:

0, 470.129.06, NVIDIA A100-SXM4-80GB
1, 470.129.06, NVIDIA A100-SXM4-80GB
2, 470.129.06, NVIDIA A100-SXM4-80GB
3, 470.129.06, NVIDIA A100-SXM4-80GB
4, 470.129.06, NVIDIA A100-SXM4-80GB
5, 470.129.06, NVIDIA A100-SXM4-80GB
6, 470.129.06, NVIDIA A100-SXM4-80GB
7, 470.129.06, NVIDIA A100-SXM4-80GB

mx781 commented 1 year ago

Hi @emigmo - could you also post the exact package versions you have installed (pip freeze if using pip)? My suspicion is that you're running a more recent version of Shapely that we haven't tested - try installing Shapely==1.7.0.

emigmo commented 1 year ago

@mx781 the exact package versions:

root@3b7013b5381a:/opt/projects/avalon# pip freeze
absl-py==1.4.0
appdirs==1.4.4
arch==5.3.0
asttokens==2.2.1
attrs==22.2.0
# Editable install with no version control (avalon-rl==1.0.1)
-e /opt/projects/avalon
backcall==0.2.0
boto3==1.26.59
botocore==1.29.59
certifi==2022.12.7
chardet==5.1.0
charset-normalizer==3.0.1
click==8.1.3
cloudpickle==2.2.1
colorlog==6.7.0
contourpy==1.0.7
cycler==0.11.0
decorator==4.4.2
dill==0.3.6
dm-tree==0.1.8
docker-pycreds==0.4.0
einops==0.6.0
executing==1.2.0
fire==0.5.0
fonttools==4.38.0
gitdb==4.0.10
GitPython==3.1.30
godot-parser==0.1.6
gym==0.25.2
gym-notices==0.0.8
idna==3.4
imageio==2.25.0
imageio-ffmpeg==0.4.8
importlib-metadata==6.0.0
ipython==8.9.0
jedi==0.18.2
jmespath==1.0.1
jsonschema==4.17.3
kiwisolver==1.4.4
llvmlite==0.39.1
loguru==0.6.0
lxml==4.9.2
mapbox-earcut==1.0.1
matplotlib==3.6.3
matplotlib-inline==0.1.6
moviepy==1.0.3
mpmath==1.2.1
msgpack==1.0.4
networkx==3.0
nptyping==2.4.1
numba==0.56.4
numpy==1.23.5
openturns==1.20.post3
packaging==23.0
pandas==1.5.3
parso==0.8.3
pathtools==0.1.2
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.4.0
pkg_resources==0.0.0
proglog==0.1.10
prompt-toolkit==3.0.36
property-cached==1.6.4
protobuf==4.21.12
psutil==5.9.4
ptyprocess==0.7.0
pure-eval==0.2.2
pycollada==0.7.2
pyglet==1.5.27
Pygments==2.14.0
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
pytz==2022.7.1
PyWavelets==1.4.1
PyYAML==6.0
requests==2.28.2
rliable==1.0.8
Rtree==1.0.1
s3transfer==0.6.0
scikit-image==0.19.3
scipy==1.9.3
seaborn==0.12.2
sentry-sdk==1.14.0
setproctitle==1.3.2
sh==1.14.3
shapely==2.0.0
six==1.16.0
smmap==5.0.0
stack-data==0.6.2
statsmodels==0.13.5
svg.path==6.2
sympy==1.11.1
termcolor==2.2.0
tifffile==2023.1.23.1
torch==1.12.0+cu113
torchvision==0.13.0+cu113
tqdm==4.64.1
traitlets==5.8.1
trimesh==3.18.1
typing_extensions==4.4.0
urllib3==1.26.14
wandb==0.13.9
wcwidth==0.2.6
xxhash==3.2.0
zipp==3.12.0

emigmo commented 1 year ago

requirements_frozen.txt pins the Shapely package to 1.8.5.post1,

but pip install -e . automatically installs shapely-2.0.1.

I cannot install a specific version with pip install shapely==1.7.0:

root@3b7013b5381a:/opt/projects/avalon# pip install shapely==1.7.0
Collecting shapely==1.7.0
  Using cached Shapely-1.7.0.tar.gz (349 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      Failed `CDLL(libgeos_c.so.1)`
      Failed `CDLL(libgeos_c.so)`
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/setup.py", line 85, in <module>
          from shapely._buildcfg import geos_version_string, geos_version, \
        File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/shapely/_buildcfg.py", line 169, in <module>
          lgeos = load_dll('geos_c',
        File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/shapely/_buildcfg.py", line 162, in load_dll
          raise OSError(
      OSError: Could not find library geos_c or load any of its variants ['libgeos_c.so.1', 'libgeos_c.so']
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

mx781 commented 1 year ago

Yup, 1.7.0 and 1.8.5.post1 should both have been tested and work, whereas Shapely 2.0.0 came out recently and we haven't tested it, so we will probably need to restrict the allowed versions for the package. Does installing and running with 1.8.5.post1 work?

For the libgeos issue, you should just need sudo apt-get install libgeos-dev on Debian, or

yum -y install epel-release
yum -y install geos geos-devel

on Red Hat.
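
The Failed CDLL(libgeos_c.so.1) lines in the pip output above indicate that the GEOS C library could not be loaded while building Shapely 1.7.0 from source. Below is a minimal sketch of a check that could be run before retrying the install, assuming a Linux system. The names find_geos_c and CANDIDATES are hypothetical and used only for illustration. The candidate list is copied from the error message, not from Shapely's own code.

```python
import ctypes
import ctypes.util
from typing import Optional

# Library names taken from the OSError in the pip output above (assumption:
# these are the names worth probing on a Linux host).
CANDIDATES = ("libgeos_c.so.1", "libgeos_c.so")


def find_geos_c() -> Optional[str]:
    """Return a name or path under which the GEOS C library loads, else None."""
    located = ctypes.util.find_library("geos_c")
    names = ([located] if located else []) + list(CANDIDATES)
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
            return name
        except OSError:
            continue
    return None


if __name__ == "__main__":
    found = find_geos_c()
    if found is None:
        print("GEOS C library not found; install libgeos-dev or geos-devel first")
    else:
        print(f"GEOS C library loadable as {found}")
```

If this reports that the library is missing, a source build of Shapely 1.7.0 would be expected to fail the same way as in the log above.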

bai-generally-intelligent commented 1 year ago

I'm able to reproduce the issue if I install shapely==2.0.0. But it runs with no problem if I install shapely==2.0.1.

It looks like this issue in the shapely library caused the error, which they fixed in 2.0.1.

bai-generally-intelligent commented 1 year ago

Closing this issue because upgrading shapely to 2.0.1 appears to fix it. Please reopen if the issue persists.
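
For anyone landing here with the same error, the following is a minimal sketch of a check for the one release this thread reports as failing (2.0.0). The names KNOWN_BAD, installed_version, and is_known_bad_shapely are hypothetical and not part of Avalon or Shapely. The check only encodes what was reported above and says nothing about versions that were not discussed.

```python
from importlib import metadata
from typing import Optional

# Only release reported as broken in this thread (assumption: exact match
# on the version string is enough for a quick sanity check).
KNOWN_BAD = {"2.0.0"}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_known_bad_shapely(version: Optional[str]) -> bool:
    """True only for the Shapely release that failed in this thread."""
    return version in KNOWN_BAD


if __name__ == "__main__":
    version = installed_version("shapely")
    if version is None:
        print("shapely is not installed")
    elif is_known_bad_shapely(version):
        print(f"shapely {version} is reported to break world generation; "
              "try 2.0.1 or 1.8.5.post1")
    else:
        print(f"shapely {version} is not the release reported as failing here")
```

One way to apply the same restriction at install time would be a requirement specifier such as shapely!=2.0.0. That is an assumption about how the constraint could be expressed, not a description of what the maintainers did.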