google-deepmind / dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Apache License 2.0

_glfwPlatformGetTls: Assertion `tls->posix.allocated == 1' failed. #285

Open DexiongYung opened 2 years ago

DexiongYung commented 2 years ago

Error: _glfwPlatformGetTls: Assertion `tls->posix.allocated == 1' failed.

I'm running MuJoCo 2.0 with the following system packages and pip dependencies:

libglewmx-dev
libglew-dev
libglew2.0
libglewmx1.13
(7000) dyung6@ripl-w3:~/rad$ apt-cache pkgnames libglfw
libglfw3-wayland
libglfw3
libglfw3-dev
libglfw3-doc
(7000) dyung6@ripl-w3:~/rad$ conda activate rad
(rad) dyung6@ripl-w3:~/rad$ pip freeze
absl-py @ file:///opt/conda/conda-bld/absl-py_1639803114343/work
cached-property==1.5.2
cachetools==4.2.4
certifi==2021.5.30
cffi @ file:///tmp/build/80754af9/cffi_1625814693874/work
charset-normalizer==2.0.12
cloudpickle==2.0.0
cycler==0.11.0
dataclasses==0.8
decorator==4.4.2
dm-control==0.0.364896371
dm-env==1.5
dm-tree==0.1.6
dmc2gym @ git+https://github.com/1nadequacy/dmc2gym.git@06f7e335d988b17145947be9f6a76f557d0efe81
future==0.18.2
glfw==2.5.2
google-auth==2.6.3
google-auth-oauthlib==0.4.6
grpcio==1.44.0
gym==0.21.0
h5py==3.1.0
idna==3.3
imageio==2.15.0
imageio-ffmpeg==0.4.5
importlib-metadata==4.8.3
importlib-resources==5.4.0
kiwisolver==1.3.1
labmaze==1.0.5
lxml==4.8.0
Markdown==3.3.6
matplotlib==3.3.4
mkl-fft==1.3.0
mkl-random==1.1.1
mkl-service==2.3.0
networkx==2.5.1
numpy @ file:///tmp/build/80754af9/numpy_and_numpy_base_1603487797006/work
oauthlib==3.2.0
olefile==0.46
Pillow==8.4.0
protobuf==3.19.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
PyOpenGL==3.1.6
pyparsing @ file:///tmp/build/80754af9/pyparsing_1635766073266/work
python-dateutil==2.8.2
PyWavelets==1.1.1
requests==2.27.1
requests-oauthlib==1.3.1
rsa==4.8
scikit-image==0.17.2
scipy==1.5.4
six @ file:///tmp/build/80754af9/six_1644875935023/work
tabulate==0.8.9
tb-nightly==2.9.0a20220408
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
termcolor==1.1.0
tifffile==2020.9.3
torch==1.10.1
torchvision==0.11.2
tqdm==4.64.0
typing_extensions==4.1.1
urllib3==1.26.9
Werkzeug==2.0.3
zipp==3.6.0
(rad) dyung6@ripl-w3:~/rad$ python --version
Python 3.6.13 :: Anaconda, Inc.
(rad) dyung6@ripl-w3:~/rad$ conda activate 7000
(7000) dyung6@ripl-w3:~/rad$ pip freeze
absl-py==1.0.0
certifi==2021.10.8
charset-normalizer==2.0.12
cloudpickle==2.0.0
dm-control==1.0.1
dm-env==1.5
dm-tree==0.1.6
dmc2gym @ git+https://github.com/denisyarats/dmc2gym.git@06f7e335d988b17145947be9f6a76f557d0efe81
future==0.18.2
glfw==2.5.3
gym==0.23.1
gym-notices==0.0.6
h5py==3.6.0
idna==3.3
importlib-metadata==4.11.3
labmaze==1.0.5
lxml==4.8.0
mujoco==2.1.4
numpy==1.22.3
protobuf==3.20.0
PyOpenGL==3.1.6
pyparsing==2.4.7
requests==2.27.1
scipy==1.8.0
six==1.16.0
tqdm==4.64.0
urllib3==1.26.9
zipp==3.8.0

Here is my code:

from dm_control import suite
import numpy as np

random_state = np.random.RandomState(42)
env = suite.load('hopper', 'stand', task_kwargs={'random': random_state})

# Simulate episode with random actions
duration = 4  # Seconds
frames = []
ticks = []
rewards = []
observations = []

spec = env.action_spec()
time_step = env.reset()

while env.physics.data.time < duration:

  action = random_state.uniform(spec.minimum, spec.maximum, spec.shape)
  time_step = env.step(action)
  obs = time_step.observation
  camera0 = env.physics.render(camera_id=0, height=200, width=200)
  pixels = env.physics.render(height=200, width=200, camera_id=0)
  print('')

I have libglew2.0 and libglfw3 installed, and I'm running Ubuntu 18.04.5 with Python 3.9.12. The error appears when I try to render, and I'm not sure why.

saran-t commented 2 years ago

"TLS" above refers to thread-local storage, and it looks like for some reason GLFW is failing to allocate TLS memory.

I've never seen this happen before, so please take what I say here with a grain of salt. From what you posted, the most likely culprit is libglfw3-wayland. The copy of libglew.so that shipped with MuJoCo 2.0 doesn't support Wayland, only GLX (i.e. OpenGL on X11). However, you also have libglfw3 installed, which uses GLX.

As a random stab in the dark, I would suggest that you try to uninstall libglfw3-wayland to see if this is caused by GLFW using the wrong backend.
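One quick way to see which windowing system a process can actually reach is to look at the environment it runs in. The sketch below is only a diagnostic, assuming DISPLAY and WAYLAND_DISPLAY are set the usual way by the session. describe_display is a hypothetical helper, not part of GLFW, MuJoCo, or dm_control.

```python
import os


def describe_display(environ):
    """Classify the display setup visible in an environment mapping."""
    # DISPLAY is set when an X11 (or XWayland) server is reachable.
    if environ.get("DISPLAY"):
        return "x11"
    # WAYLAND_DISPLAY alone means only a Wayland compositor is reachable.
    if environ.get("WAYLAND_DISPLAY"):
        return "wayland"
    # Neither variable set: no windowing system at all.
    return "headless"
```

Calling describe_display(os.environ) from the shell that launches the script returns "headless" when neither variable is set, which is common in a plain SSH session.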

DexiongYung commented 2 years ago

Hmm, I checked whether I had Wayland installed via dpkg -s libglfw3-wayland, and it says it's not installed.
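Note that apt-cache pkgnames prints every package name apt knows about, whether or not it is installed, so the libglfw listing in the original report does not show what is actually on the machine. Below is a minimal sketch for checking the installed state from Python, assuming a Debian-based system with dpkg-query on the PATH. status_is_installed and query_status are hypothetical helper names.

```python
import subprocess


def status_is_installed(status):
    """True if a dpkg status string such as 'install ok installed' ends in 'installed'."""
    # The status has three fields: wanted action, error flag, current state.
    fields = status.split()
    return len(fields) == 3 and fields[2] == "installed"


def query_status(package):
    """Return the dpkg status string for a package, or '' if dpkg has no record."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Status}", package],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means the package is unknown
    )
    return result.stdout if result.returncode == 0 else ""
```

For example, status_is_installed(query_status("libglfw3")) reports whether the X11 build of GLFW is installed.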

saran-t commented 2 years ago

I can't really be of much more help here I'm afraid, since I cannot reproduce this issue on my end.

Can you please check if the issue still persists with mujoco==2.1.5 that was just released today?

DexiongYung commented 2 years ago

xvfb-run -a -s "-screen 0 1400x900x24" bash

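This fixed it. xvfb-run starts a virtual X display, and commands run from the resulting bash shell render against it.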
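The same approach can wrap a single command instead of a whole shell. The sketch below builds an xvfb-run command line with the options from the comment above. train.py is a placeholder script name and xvfb_command is a hypothetical helper. It assumes xvfb-run is installed (on Ubuntu it comes with the xvfb package).

```python
import shutil
import subprocess
import sys


def xvfb_command(script, screen="1400x900x24"):
    """Build an argv list that runs `script` under a virtual X display."""
    return [
        "xvfb-run",
        "-a",  # pick a free display number automatically
        "-s",
        f"-screen 0 {screen}",  # arguments passed through to the Xvfb server
        sys.executable,
        script,
    ]


def run_under_xvfb(script):
    """Run the script under Xvfb, failing early if xvfb-run is missing."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found; install Xvfb first")
    return subprocess.run(xvfb_command(script), check=True)
```

Here run_under_xvfb("train.py") would start the placeholder script with a 1400x900, 24-bit virtual screen.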

fuyw commented 1 year ago

export MUJOCO_GL=egl
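The variable can also be set from Python, as long as that happens before dm_control is imported. Below is a minimal sketch, assuming an EGL-capable GPU driver is available. The dm_control README lists glfw, egl, and osmesa as the accepted MUJOCO_GL values. render_one_frame is a hypothetical helper.

```python
import os

# Must run before any dm_control import, since the rendering backend is
# chosen from MUJOCO_GL at import time. setdefault keeps a value that was
# already exported in the shell.
os.environ.setdefault("MUJOCO_GL", "egl")


def render_one_frame(height=200, width=200):
    """Load the hopper task from the report and render a single frame."""
    from dm_control import suite  # deferred so MUJOCO_GL is set first

    env = suite.load("hopper", "stand")
    env.reset()
    return env.physics.render(height=height, width=width, camera_id=0)
```

With EGL the frame is rendered off-screen, so no X display or Xvfb wrapper is needed.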