This is quite a good work. I used pip install dev.. to install, but there are quite a lot problems.
(vlmrm) root@autodl-container-d33848b29e-3752a142:~/vlmrm/vlmrm# vlmrm train "$(cat config.yaml)"
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
2024-08-02 19:13:27.076 | INFO | vlmrm.trainer.train:train:146 - Command called: /root/miniconda3/envs/vlmrm/bin/vlmrm train env_name: Humanoid-v4 # RL environment name
base_path: /data/runs/training # Base path to save logs and checkpoints
seed: 42 # Seed for reproducibility
description: Humanoid training using CLIP reward
tags: # Wandb tags
training
humanoid
CLIP
reward:
name: clip
pretrained_model: ViT-g-14/laion2b_s34b_b88k # CLIP model name
CLIP batch size per synchronous inference step.
Batch size must be divisible by n_workers (GPU count)
so that it can be shared among workers, and must be a divisor
of n_envs * episode_length so that all batches can be of the
same size (no support for variable batch size as of now.)
batch_size: 200 #1600
alpha: 0.5 # Alpha value of Baseline CLIP (CO-RELATE)
target_prompts: # Description of the goal state
a humanoid robot kneeling
baseline_prompts: # Description of the environment
a humanoid robot
Path to pre-saved model weights. When executing multiple runs,
mount a volume to this path to avoid downloading the model
weights multiple times.
cache_dir: /root/.cache
rl:
policy_name: MlpPolicy
n_steps: 100000 # Total number of simulation steps to be collected.
n_envs_per_worker: 2 # Number of environments per worker (GPU)
episode_length: 200 # Desired episode length
learning_starts: 100 # Number of env steps to collect before training
train_freq: 200 # Number of collected env steps between training iterations
batch_size: 64 # SAC buffer sample size per gradient step
gradient_steps: 1 # Number of samples to collect from the buffer per training step
tau: 0.005 # SAC target network update rate
gamma: 0.99 # SAC discount factor
learning_rate: 3e-4 # SAC optimizer learning rate
logging:
checkpoint_freq: 800 # Number of env steps between checkpoints
video_freq: 800 # Number of env steps between videos
tensorboard_freq: 800 # Number of env steps between tensorboard logs
2024-08-02 19:13:27.166 | INFO | vlmrm.trainer.train:train:159 - Started run with id Humanoid_CLIP_2024-08-02_19-13-27_d1f41a6e
2024-08-02 19:13:27.167 | INFO | vlmrm.trainer.train:train:162 - Logging experiment metadata
2024-08-02 19:13:27.168 | INFO | vlmrm.trainer.train:_train:170 - Running CLIP-rewarded SAC. Spawning workers.
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
2024-08-02 19:13:30.281 | INFO | vlmrm.trainer.train:primary_worker:69 - Creating environment instance
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
Process ForkServerProcess-1:2:
Traceback (most recent call last):
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(self._args, self._kwargs)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 29, in _worker
env = _patch_env(env_fn_wrapper.var())
File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 66, in _init
env = make_env_fn(env_kwargs)
File "/root/vlmrm/vlmrm/src/vlmrm/envs/base.py", line 25, in make_env_wrapper
env = gymnasium.make(
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 755, in make
env_creator = load_env_creator(env_spec.entry_point)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 553, in load_env_creator
mod = importlib.import_module(mod_name)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 986, in _find_and_load_unlocked
File "", line 680, in _load_unlocked
File "", line 790, in exec_module
File "", line 228, in _call_with_frames_removed
File "/root/vlmrm/vlmrm/src/vlmrm/envs/mujoco/clip_rewarded_humanoid.py", line 6, in
from gymnasium.envs.mujoco import MujocoEnv
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/init.py", line 1, in
from gymnasium.envs.mujoco.mujoco_env import MujocoEnv, MuJocoPyEnv # isort:skip
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_env.py", line 19, in
import mujoco
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/init.py", line 48, in
from mujoco.gl_context import
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/gl_context.py", line 41, in
from mujoco.egl import GLContext as _GLContext
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/init.py", line 31, in
from mujoco.egl import egl_ext as EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/egl_ext.py", line 27, in
from OpenGL import EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/EGL/init.py", line 2, in
from OpenGL.raw.EGL._types import *
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/raw/EGL/_types.py", line 87, in
raw_eglQueryString = _p.PLATFORM.EGL.eglQueryString
AttributeError: 'NoneType' object has no attribute 'eglQueryString'
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
warnings.warn(
Process ForkServerProcess-1:1:
Traceback (most recent call last):
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(self._args, self._kwargs)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 29, in _worker
env = _patch_env(env_fn_wrapper.var())
File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 66, in _init
env = make_env_fn(env_kwargs)
File "/root/vlmrm/vlmrm/src/vlmrm/envs/base.py", line 25, in make_env_wrapper
env = gymnasium.make(
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 755, in make
env_creator = load_env_creator(env_spec.entry_point)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 553, in load_env_creator
mod = importlib.import_module(mod_name)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 986, in _find_and_load_unlocked
File "", line 680, in _load_unlocked
File "", line 790, in exec_module
File "", line 228, in _call_with_frames_removed
File "/root/vlmrm/vlmrm/src/vlmrm/envs/mujoco/clip_rewarded_humanoid.py", line 6, in
from gymnasium.envs.mujoco import MujocoEnv
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/init.py", line 1, in
from gymnasium.envs.mujoco.mujoco_env import MujocoEnv, MuJocoPyEnv # isort:skip
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_env.py", line 19, in
import mujoco
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/init.py", line 48, in
from mujoco.gl_context import
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/gl_context.py", line 41, in
from mujoco.egl import GLContext as _GLContext
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/init.py", line 31, in
from mujoco.egl import egl_ext as EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/egl_ext.py", line 27, in
from OpenGL import EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/EGL/init.py", line 2, in
from OpenGL.raw.EGL._types import *
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/raw/EGL/_types.py", line 87, in
raw_eglQueryString = _p.PLATFORM.EGL.eglQueryString
AttributeError: 'NoneType' object has no attribute 'eglQueryString'
2024-08-02 19:13:34.172 | ERROR | vlmrm.trainer.train:train:184 - An error has been caught in function 'train', process 'MainProcess' (21862), thread 'MainThread' (140350658221888):
Traceback (most recent call last):
File "/root/miniconda3/envs/vlmrm/bin/vlmrm", line 8, in
sys.exit(main())
│ │ └ <function main at 0x7fa5eef223a0>
│ └
└ <module 'sys' (built-in)>
File "/root/vlmrm/vlmrm/src/vlmrm/cli/main.py", line 12, in main
app()
└ <typer.main.Typer object at 0x7fa5eef13c10>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <typer.main.Typer object at 0x7fa5eef13c10>
└ <function get_command at 0x7fa5ee57df70>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1157, in call
return self.main(args, kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function TyperGroup.main at 0x7fa5ee572c10>
└
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/core.py", line 778, in main
return _main(
└ <function _main at 0x7fa5ee56eee0>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7fa5e3320bb0>
│ └ <function MultiCommand.invoke at 0x7fa5eec660d0>
└
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
│ │ │ │ └ <click.core.Context object at 0x7fa5e32297c0>
│ │ │ └ <function Command.invoke at 0x7fa5eec61b80>
│ │ └
│ └ <click.core.Context object at 0x7fa5e32297c0>
└ <function MultiCommand.invoke.._process_result at 0x7fa5e3228310>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, ctx.params)
│ │ │ │ │ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
│ │ │ │ └ <click.core.Context object at 0x7fa5e32297c0>
│ │ │ └ <function train at 0x7fa5e3230670>
│ │ └
│ └ <function Context.invoke at 0x7fa5eec5d940>
└ <click.core.Context object at 0x7fa5e32297c0>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(args, kwargs)
│ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
└ ()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
│ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
└ <function train at 0x7fa5e32303a0>
File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 184, in train
_train()
└ <function train.._train at 0x7fa5e342f940>
File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 172, in _train
multiprocessing.spawn(
│ └ <function spawn at 0x7fa5e3315670>
└ <module 'vlmrm.multiprocessing' from '/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py'>
File "/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py", line 59, in start_processes
while not context.join():
│ └ <function ProcessContext.join at 0x7fa55372ce50>
└ <torch.multiprocessing.spawn.ProcessContext object at 0x7fa5e3421100>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
│ │ │ │ └ <property object at 0x7fa55376ea90>
│ │ │ └
│ │ └ 0
│ └ '\n\n-- Process 0 terminated with the following error:\nTraceback (most recent call last):\n File "/root/vlmrm/vlmrm/src/vlm...
└ <class 'torch.multiprocessing.spawn.ProcessRaisedException'>
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py", line 17, in _wrap
fn(i, stop_event, *args)
File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 202, in init_process
primary_worker(config, config_dump, stop_event)
File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 70, in primary_worker
vec_env = make_vec_env(
File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 100, in make_vec_env
vec_env = vec_env_cls(make_env_fns, *vec_env_kwargs)
File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/subproc_vec_env.py", line 11, in init
super().init(args, **kwargs)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 120, in init
observation_space, action_space = self.remotes[0].recv()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 384, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
This is quite a good work. I used pip install dev.. to install, but there are quite a lot problems.
(vlmrm) root@autodl-container-d33848b29e-3752a142:~/vlmrm/vlmrm# vlmrm train "$(cat config.yaml)" /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( 2024-08-02 19:13:27.076 | INFO | vlmrm.trainer.train:train:146 - Command called: /root/miniconda3/envs/vlmrm/bin/vlmrm train env_name: Humanoid-v4 # RL environment name base_path: /data/runs/training # Base path to save logs and checkpoints seed: 42 # Seed for reproducibility description: Humanoid training using CLIP reward tags: # Wandb tagsCLIP batch size per synchronous inference step.
Batch size must be divisible by n_workers (GPU count)
so that it can be shared among workers, and must be a divisor
of n_envs * episode_length so that all batches can be of the
same size (no support for variable batch size as of now.)
batch_size: 200 #1600 alpha: 0.5 # Alpha value of Baseline CLIP (CO-RELATE) target_prompts: # Description of the goal state
Path to pre-saved model weights. When executing multiple runs,
mount a volume to this path to avoid downloading the model
weights multiple times.
cache_dir: /root/.cache rl: policy_name: MlpPolicy n_steps: 100000 # Total number of simulation steps to be collected. n_envs_per_worker: 2 # Number of environments per worker (GPU) episode_length: 200 # Desired episode length learning_starts: 100 # Number of env steps to collect before training train_freq: 200 # Number of collected env steps between training iterations batch_size: 64 # SAC buffer sample size per gradient step gradient_steps: 1 # Number of samples to collect from the buffer per training step tau: 0.005 # SAC target network update rate gamma: 0.99 # SAC discount factor learning_rate: 3e-4 # SAC optimizer learning rate logging: checkpoint_freq: 800 # Number of env steps between checkpoints video_freq: 800 # Number of env steps between videos
tensorboard_freq: 800 # Number of env steps between tensorboard logs
2024-08-02 19:13:27.166 | INFO | vlmrm.trainer.train:train:159 - Started run with id Humanoid_CLIP_2024-08-02_19-13-27_d1f41a6e 2024-08-02 19:13:27.167 | INFO | vlmrm.trainer.train:train:162 - Logging experiment metadata 2024-08-02 19:13:27.168 | INFO | vlmrm.trainer.train:_train:170 - Running CLIP-rewarded SAC. Spawning workers. /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( 2024-08-02 19:13:30.281 | INFO | vlmrm.trainer.train:primary_worker:69 - Creating environment instance /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".You may be able to resolve this warning by setting", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 986, in _find_and_load_unlocked
File "", line 680, in _load_unlocked
File "", line 790, in exec_module
File "", line 228, in _call_with_frames_removed
File "/root/vlmrm/vlmrm/src/vlmrm/envs/mujoco/clip_rewarded_humanoid.py", line 6, in
from gymnasium.envs.mujoco import MujocoEnv
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/init.py", line 1, in
from gymnasium.envs.mujoco.mujoco_env import MujocoEnv, MuJocoPyEnv # isort:skip
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_env.py", line 19, in
import mujoco
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/init.py", line 48, in
from mujoco.gl_context import
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/gl_context.py", line 41, in
from mujoco.egl import GLContext as _GLContext
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/init.py", line 31, in
from mujoco.egl import egl_ext as EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/egl_ext.py", line 27, in
from OpenGL import EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/EGL/init.py", line 2, in
from OpenGL.raw.EGL._types import *
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/raw/EGL/_types.py", line 87, in
raw_eglQueryString = _p.PLATFORM.EGL.eglQueryString
AttributeError: 'NoneType' object has no attribute 'eglQueryString'
/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "modelcheckpoint" has conflict with protected namespace "model".
model_config['protected_namespaces'] = ()
. warnings.warn( Process ForkServerProcess-1:2: Traceback (most recent call last): File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap self.run() File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 108, in run self._target(self._args, self._kwargs) File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 29, in _worker env = _patch_env(env_fn_wrapper.var()) File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 66, in _init env = make_env_fn(env_kwargs) File "/root/vlmrm/vlmrm/src/vlmrm/envs/base.py", line 25, in make_env_wrapper env = gymnasium.make( File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 755, in make env_creator = load_env_creator(env_spec.entry_point) File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 553, in load_env_creator mod = importlib.import_module(mod_name) File "/root/miniconda3/envs/vlmrm/lib/python3.9/importlib/init.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "You may be able to resolve this warning by setting
model_config['protected_namespaces'] = ()
. warnings.warn( /root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_basepath" has conflict with protected namespace "model".You may be able to resolve this warning by setting", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 986, in _find_and_load_unlocked
File "", line 680, in _load_unlocked
File "", line 790, in exec_module
File "", line 228, in _call_with_frames_removed
File "/root/vlmrm/vlmrm/src/vlmrm/envs/mujoco/clip_rewarded_humanoid.py", line 6, in
from gymnasium.envs.mujoco import MujocoEnv
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/init.py", line 1, in
from gymnasium.envs.mujoco.mujoco_env import MujocoEnv, MuJocoPyEnv # isort:skip
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/mujoco/mujoco_env.py", line 19, in
import mujoco
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/init.py", line 48, in
from mujoco.gl_context import
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/gl_context.py", line 41, in
from mujoco.egl import GLContext as _GLContext
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/init.py", line 31, in
from mujoco.egl import egl_ext as EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/mujoco/egl/egl_ext.py", line 27, in
from OpenGL import EGL
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/EGL/init.py", line 2, in
from OpenGL.raw.EGL._types import *
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/OpenGL/raw/EGL/_types.py", line 87, in
raw_eglQueryString = _p.PLATFORM.EGL.eglQueryString
AttributeError: 'NoneType' object has no attribute 'eglQueryString'
2024-08-02 19:13:34.172 | ERROR | vlmrm.trainer.train:train:184 - An error has been caught in function 'train', process 'MainProcess' (21862), thread 'MainThread' (140350658221888):
Traceback (most recent call last):
model_config['protected_namespaces'] = ()
. warnings.warn( Process ForkServerProcess-1:1: Traceback (most recent call last): File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap self.run() File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/process.py", line 108, in run self._target(self._args, self._kwargs) File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 29, in _worker env = _patch_env(env_fn_wrapper.var()) File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 66, in _init env = make_env_fn(env_kwargs) File "/root/vlmrm/vlmrm/src/vlmrm/envs/base.py", line 25, in make_env_wrapper env = gymnasium.make( File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 755, in make env_creator = load_env_creator(env_spec.entry_point) File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/gymnasium/envs/registration.py", line 553, in load_env_creator mod = importlib.import_module(mod_name) File "/root/miniconda3/envs/vlmrm/lib/python3.9/importlib/init.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "File "/root/miniconda3/envs/vlmrm/bin/vlmrm", line 8, in
sys.exit(main())
│ │ └ <function main at 0x7fa5eef223a0>
│ └
└ <module 'sys' (built-in)>
File "/root/vlmrm/vlmrm/src/vlmrm/cli/main.py", line 12, in main app() └ <typer.main.Typer object at 0x7fa5eef13c10>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/main.py", line 311, in call return get_command(self)(*args, kwargs) │ │ │ └ {} │ │ └ () │ └ <typer.main.Typer object at 0x7fa5eef13c10> └ <function get_command at 0x7fa5ee57df70> File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1157, in call return self.main(args, kwargs) │ │ │ └ {} │ │ └ () │ └ <function TyperGroup.main at 0x7fa5ee572c10> └
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/core.py", line 778, in main
return _main(
└ <function _main at 0x7fa5ee56eee0>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7fa5e3320bb0>
│ └ <function MultiCommand.invoke at 0x7fa5eec660d0>
└
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
│ │ │ │ └ <click.core.Context object at 0x7fa5e32297c0>
│ │ │ └ <function Command.invoke at 0x7fa5eec61b80>
│ │ └
│ └ <click.core.Context object at 0x7fa5e32297c0>
└ <function MultiCommand.invoke.._process_result at 0x7fa5e3228310>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, ctx.params)
│ │ │ │ │ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
│ │ │ │ └ <click.core.Context object at 0x7fa5e32297c0>
│ │ │ └ <function train at 0x7fa5e3230670>
│ │ └
│ └ <function Context.invoke at 0x7fa5eec5d940>
└ <click.core.Context object at 0x7fa5e32297c0>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback( args, kwargs)
│ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
└ ()
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
│ └ {'config': 'env_name: Humanoid-v4 # RL environment name\nbase_path: /data/runs/training # Base path to save logs and checkpoi...
└ <function train at 0x7fa5e32303a0>
File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 172, in _train multiprocessing.spawn( │ └ <function spawn at 0x7fa5e3315670> └ <module 'vlmrm.multiprocessing' from '/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py'>
File "/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py", line 103, in spawn return start_processes(fn, args, nprocs, join, daemon, start_method="spawn") │ │ │ │ │ └ False │ │ │ │ └ True │ │ │ └ 1 │ │ └ ('nccl', Config(env_name='Humanoid-v4', base_path=PosixPath('/data/runs/training'), seed=42, description='Humanoid training u... │ └ <function init_process at 0x7fa5e3230430> └ <function start_processes at 0x7fa5e33155e0>
File "/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py", line 59, in start_processes while not context.join(): │ └ <function ProcessContext.join at 0x7fa55372ce50> └ <torch.multiprocessing.spawn.ProcessContext object at 0x7fa5e3421100>
File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join raise ProcessRaisedException(msg, error_index, failed_process.pid) │ │ │ │ └ <property object at 0x7fa55376ea90> │ │ │ └
│ │ └ 0
│ └ '\n\n-- Process 0 terminated with the following error:\nTraceback (most recent call last):\n File "/root/vlmrm/vlmrm/src/vlm...
└ <class 'torch.multiprocessing.spawn.ProcessRaisedException'>
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error: Traceback (most recent call last): File "/root/vlmrm/vlmrm/src/vlmrm/multiprocessing.py", line 17, in _wrap fn(i, stop_event, *args) File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 202, in init_process primary_worker(config, config_dump, stop_event) File "/root/vlmrm/vlmrm/src/vlmrm/trainer/train.py", line 70, in primary_worker vec_env = make_vec_env( File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/make_vec_env.py", line 100, in make_vec_env vec_env = vec_env_cls(make_env_fns, *vec_env_kwargs) File "/root/vlmrm/vlmrm/src/vlmrm/contrib/sb3/subproc_vec_env.py", line 11, in init super().init(args, **kwargs) File "/root/miniconda3/envs/vlmrm/lib/python3.9/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 120, in init observation_space, action_space = self.remotes[0].recv() File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 255, in recv buf = self._recv_bytes() File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes buf = self._recv(4) File "/root/miniconda3/envs/vlmrm/lib/python3.9/multiprocessing/connection.py", line 384, in _recv chunk = read(handle, remaining) ConnectionResetError: [Errno 104] Connection reset by peer