emigmo closed this issue 1 year ago
The training process is still running, though, according to the log file [wandb].
Interesting, we haven't seen this error before and I cannot reproduce it on my system. Can you run the following on your host system to provide some more info?
cat /proc/version
nvidia-smi --query-gpu=index,driver_version,name --format=csv,noheader
OS version:
Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Nov 8 23:39:32 UTC 2018
GPU and driver info:
0, 470.129.06, NVIDIA A100-SXM4-80GB
1, 470.129.06, NVIDIA A100-SXM4-80GB
2, 470.129.06, NVIDIA A100-SXM4-80GB
3, 470.129.06, NVIDIA A100-SXM4-80GB
4, 470.129.06, NVIDIA A100-SXM4-80GB
5, 470.129.06, NVIDIA A100-SXM4-80GB
6, 470.129.06, NVIDIA A100-SXM4-80GB
7, 470.129.06, NVIDIA A100-SXM4-80GB
Hi @emigmo - could you also post the exact package versions you have installed (pip freeze if using pip)? My suspicion is that you're running a more recent version of Shapely that we haven't tested - try installing Shapely==1.7.0.
@mx781 the exact package versions:
root@3b7013b5381a:/opt/projects/avalon# pip freeze
absl-py==1.4.0
appdirs==1.4.4
arch==5.3.0
asttokens==2.2.1
attrs==22.2.0
# Editable install with no version control (avalon-rl==1.0.1)
-e /opt/projects/avalon
backcall==0.2.0
boto3==1.26.59
botocore==1.29.59
certifi==2022.12.7
chardet==5.1.0
charset-normalizer==3.0.1
click==8.1.3
cloudpickle==2.2.1
colorlog==6.7.0
contourpy==1.0.7
cycler==0.11.0
decorator==4.4.2
dill==0.3.6
dm-tree==0.1.8
docker-pycreds==0.4.0
einops==0.6.0
executing==1.2.0
fire==0.5.0
fonttools==4.38.0
gitdb==4.0.10
GitPython==3.1.30
godot-parser==0.1.6
gym==0.25.2
gym-notices==0.0.8
idna==3.4
imageio==2.25.0
imageio-ffmpeg==0.4.8
importlib-metadata==6.0.0
ipython==8.9.0
jedi==0.18.2
jmespath==1.0.1
jsonschema==4.17.3
kiwisolver==1.4.4
llvmlite==0.39.1
loguru==0.6.0
lxml==4.9.2
mapbox-earcut==1.0.1
matplotlib==3.6.3
matplotlib-inline==0.1.6
moviepy==1.0.3
mpmath==1.2.1
msgpack==1.0.4
networkx==3.0
nptyping==2.4.1
numba==0.56.4
numpy==1.23.5
openturns==1.20.post3
packaging==23.0
pandas==1.5.3
parso==0.8.3
pathtools==0.1.2
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.4.0
pkg_resources==0.0.0
proglog==0.1.10
prompt-toolkit==3.0.36
property-cached==1.6.4
protobuf==4.21.12
psutil==5.9.4
ptyprocess==0.7.0
pure-eval==0.2.2
pycollada==0.7.2
pyglet==1.5.27
Pygments==2.14.0
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
pytz==2022.7.1
PyWavelets==1.4.1
PyYAML==6.0
requests==2.28.2
rliable==1.0.8
Rtree==1.0.1
s3transfer==0.6.0
scikit-image==0.19.3
scipy==1.9.3
seaborn==0.12.2
sentry-sdk==1.14.0
setproctitle==1.3.2
sh==1.14.3
shapely==2.0.0
six==1.16.0
smmap==5.0.0
stack-data==0.6.2
statsmodels==0.13.5
svg.path==6.2
sympy==1.11.1
termcolor==2.2.0
tifffile==2023.1.23.1
torch==1.12.0+cu113
torchvision==0.13.0+cu113
tqdm==4.64.1
traitlets==5.8.1
trimesh==3.18.1
typing_extensions==4.4.0
urllib3==1.26.14
wandb==0.13.9
wcwidth==0.2.6
xxhash==3.2.0
zipp==3.12.0
You specify the Shapely package in requirements_frozen.txt as 1.8.5.post1, but pip install -e . automatically installs shapely-2.0.1.
I cannot pin the package version through pip install shapely==1.7.0:
root@3b7013b5381a:/opt/projects/avalon# pip install shapely==1.7.0
Collecting shapely==1.7.0
Using cached Shapely-1.7.0.tar.gz (349 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [12 lines of output]
Failed `CDLL(libgeos_c.so.1)`
Failed `CDLL(libgeos_c.so)`
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/setup.py", line 85, in <module>
from shapely._buildcfg import geos_version_string, geos_version, \
File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/shapely/_buildcfg.py", line 169, in <module>
lgeos = load_dll('geos_c',
File "/tmp/pip-install-p0h6i8co/shapely_e7a64f68f3154ab598aa6561becc2c6c/shapely/_buildcfg.py", line 162, in load_dll
raise OSError(
OSError: Could not find library geos_c or load any of its variants ['libgeos_c.so.1', 'libgeos_c.so']
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Yup, 1.7.0 and 1.8.5.post1 should both be tested and work, while shapely 2.0.0 came out recently and we haven't tested it - we will probably need to restrict the allowed versions for the package. Does installing and running with 1.8.5.post1 work?
For the libgeos issue, you should just need sudo apt-get install libgeos-dev on Debian, or
yum -y install epel-release
yum -y install geos geos-devel
on Red Hat.
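To confirm whether the GEOS C library is visible before retrying the Shapely 1.7.0 install, something like the following might help. It is only a rough sketch: it tries the same two library names that appear in the traceback above, plus whatever `ctypes.util.find_library` reports, and the helper name `find_geos_c` is made up for illustration.

```python
import ctypes
import ctypes.util

# Library names taken from the OSError in the traceback above.
CANDIDATES = ["libgeos_c.so.1", "libgeos_c.so"]


def find_geos_c(candidates=CANDIDATES):
    """Return the first GEOS C library name that loads, or None.

    Hypothetical helper for illustration; not part of Shapely or avalon.
    """
    # Ask the system linker configuration first, then fall back to the raw names.
    located = ctypes.util.find_library("geos_c")
    names = ([located] if located else []) + list(candidates)
    for name in names:
        try:
            ctypes.CDLL(name)
            return name
        except OSError:
            # This name could not be loaded; try the next one.
            continue
    return None


if __name__ == "__main__":
    found = find_geos_c()
    if found is None:
        print("GEOS C library not found; install libgeos-dev (Debian) or geos-devel (Red Hat)")
    else:
        print(f"GEOS C library loaded from: {found}")
```

If this prints a library name inside the container, the source build of Shapely 1.7.0 should at least get past the `load_dll` step shown in the traceback.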
I'm able to reproduce the issue if I install shapely==2.0.0. But it runs with no problem if I install shapely==2.0.1.
It looks like this issue in the shapely library caused the error, which they fixed in 2.0.1.
Closing this issue because upgrading shapely to 2.0.1 appears to fix it, please reopen if the issue persists.
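For anyone landing here later, a quick way to check whether your environment has the affected release is sketched below. It assumes only what was reported in this thread (2.0.0 reproduces the error, 2.0.1 does not, 1.7.0 and 1.8.5.post1 are tested); the function names are hypothetical and not part of avalon.

```python
from importlib import metadata

# Based only on reports in this thread: 2.0.0 reproduces the error.
KNOWN_BAD = {"2.0.0"}


def shapely_status(version: str) -> str:
    """Classify a Shapely version string using the reports in this thread."""
    return "known-bad" if version in KNOWN_BAD else "not-known-bad"


def installed_shapely_version():
    """Return the installed Shapely version string, or None if not installed."""
    # Try both spellings of the distribution name to be safe.
    for name in ("shapely", "Shapely"):
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    version = installed_shapely_version()
    if version is None:
        print("Shapely is not installed")
    else:
        print(f"Shapely {version}: {shapely_status(version)}")
```

If it reports known-bad, upgrading to 2.0.1 or installing 1.8.5.post1 are the two options mentioned above. Excluding the bad release with a specifier such as `shapely!=2.0.0` would be one possible pin, but that is an assumption and not the project's actual constraint.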
When running the test code
python -m avalon.agent.train_ppo_avalon
in the Docker container, I get the following error: