OpenRL-Lab / openrl

Unified Reinforcement Learning Framework
https://openrl-docs.readthedocs.io
Apache License 2.0

[Bug]: Got an unexpected keyword argument 'cfg' #248

Closed Error0229 closed 10 months ago

Error0229 commented 10 months ago

🐛 Bug

The Atari example code fails on both the stable and main branches. The cause appears to be that 'cfg' is passed through to gymnasium's make, which forwards it to AtariEnv.__init__(), where it is not an accepted keyword argument.

To Reproduce

python train_ppo.py --config atari_ppo.yaml  

Relevant log output / Error message

test@... ~/c/o/e/atari (stable) [1]> python train_ppo.py --config atari_ppo.yaml                                             (retro) 
Traceback (most recent call last):
  File "/home/test/miniconda3/envs/retro/lib/python3.10/site-packages/gymnasium/envs/registration.py", line 802, in make
    env = env_creator(**env_spec_kwargs)
TypeError: AtariEnv.__init__() got an unexpected keyword argument 'cfg'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/test/code/openrl/examples/atari/train_ppo.py", line 101, in <module>
    agent = train()
  File "/home/test/code/openrl/examples/atari/train_ppo.py", line 49, in train
    env = make(
  File "/home/test/miniconda3/envs/retro/lib/python3.10/site-packages/openrl/envs/common/registration.py", line 165, in make
    env = AsyncVectorEnv(env_fns, render_mode=render_mode, auto_reset=auto_reset)
  File "/home/test/miniconda3/envs/retro/lib/python3.10/site-packages/openrl/envs/vec_env/async_venv.py", line 96, in __init__
    dummy_env = env_fns[0]()
  File "/home/test/miniconda3/envs/retro/lib/python3.10/site-packages/openrl/envs/common/build_envs.py", line 36, in _make_env
    env = make(
  File "/home/test/miniconda3/envs/retro/lib/python3.10/site-packages/gymnasium/envs/registration.py", line 814, in make
    raise type(e)(
TypeError: AtariEnv.__init__() got an unexpected keyword argument 'cfg' was raised from the environment creator for ALE/Pong-v5 with kwargs ({'game': 'pong', 'obs_type': 'rgb', 'repeat_action_probability': 0.25, 'full_action_space': False, 'frameskip': 4, 'max_num_frames_per_episode': 108000, 'render_mode': None, 'cfg': Namespace(config=[Path_fr(/tmp/tmp4xkjxkro.yaml)], seed=0, encode_state=False, n_block=1, n_embd=64, n_head=1, dec_actor=False, share_actor=False, callbacks=None, sb3_model_path=None, sb3_algo=None, step_difference=1, gail=False, expert_data=None, gail_batch_size=128, dis_input_len=None, gail_loss_target=None, gail_epoch=5, gail_use_action=True, gail_hidden_size=256, gail_layer_num=3, gail_lr=0.0005, data_dir=None, force_rewrite=False, collector_num=1, input_data_dir=None, output_data_dir=None, worker_num=1, sample_interval=1, selfplay_api=Namespace(host='127.0.0.1', port=10086), lazy_load_opponent=True, self_play=False, selfplay_algo='WeightExistEnemy', max_play_num=2000, max_enemy_num=-1, exist_enemy_num=0, random_pos=-1, build_in_pos=-1, use_amp=False, load_optimizer=False, use_joint_action_loss=False, frameskip=None, eval_render=False, terminal='current_terminal', distributed_type='sync', program_type='local', share_temp_dir=None, share_entry_script_path=None, learner_num=1, fetch_num=1, tmux_prefix=None, kill_all=False, namespace='default', mount_path=None, mount_name=None, persistent_volume_claim_name=None, disable_training=False, use_half_actor=False, algorithm_name='ppo', experiment_name='atari_ppo', gpu_usage_type='auto', disable_cuda=False, cuda_deterministic=True, pytorch_threads=1, n_rollout_threads=32, n_eval_rollout_threads=1, n_render_rollout_threads=1, num_env_steps=10000000, user_name='openrl', wandb_entity='openrl-lab', disable_wandb=False, env_name='StarCraft2', scenario_name='default', num_agents=1, num_enemies=1, use_obs_instead_of_state=False, episode_length=128, eval_episode_length=200, max_episode_length=None, 
separate_policy=False, use_conv1d=False, stacked_frames=1, use_stacked_frames=False, hidden_size=512, layer_N=1, activation_id=1, use_popart=False, dual_clip_ppo=False, dual_clip_coeff=3, use_valuenorm=True, use_feature_normalization=False, use_orthogonal=True, gain=0.01, cnn_layers_params=None, use_maxpool2d=False, rnn_type='gru', rnn_num=1, use_naive_recurrent_policy=False, use_recurrent_policy=False, recurrent_N=1, data_chunk_length=2, use_influence_policy=False, influence_layer_N=1, use_attn=False, attn_N=1, attn_size=64, attn_heads=4, dropout=0.0, use_average_pool=True, use_attn_internal=True, use_cat_self=True, lr=0.00025, tau=0.995, critic_lr=0.00025, opti_eps=1e-05, weight_decay=0, bc_epoch=2, ppo_epoch=4, use_policy_vhead=False, use_clipped_value_loss=True, clip_param=0.1, num_mini_batch=4, mini_batch_size=None, policy_value_loss_coef=0.5, entropy_coef=0.01, value_loss_coef=0.5, use_max_grad_norm=True, max_grad_norm=10.0, use_gae=True, gamma=0.99, gae_lambda=0.95, use_proper_time_limits=False, use_huber_loss=True, use_value_active_masks=True, use_policy_active_masks=True, huber_delta=10.0, use_adv_normalize=True, aux_epoch=5, clone_coef=1.0, use_single_network=False, use_linear_lr_decay=True, save_interval=1, only_eval=False, log_interval=1, log_each_episode=True, use_rich_handler=True, use_eval=False, eval_interval=25, eval_episodes=32, save_gifs=False, use_render=False, render_episodes=5, ifi=0.1, model_dir=None, save_dir=None, init_dir=None, run_dir='./run_results/', use_transmit=False, server_address=None, use_tlaunch=False, actor_num=1, use_reward_normalization=False, buffer_size=5000, popart_update_interval_step=2, use_per=False, per_alpha=0.6, per_beta_start=0.4, per_eps=1e-06, per_nu=0.9, batch_size=32, actor_train_interval_step=2, train_interval_episode=1, train_interval=100, use_same_critic_obs=True, use_global_all_local_state=False, prev_act_inp=False, target_update=10, var=0.5, actor_lr=0.001, auto_alph=False, alpha_value=0.2, alpha_lr=0.0002, 
use_soft_update=True, hard_update_interval_episode=200, num_random_episodes=5, epsilon_start=1.0, epsilon_finish=0.05, epsilon_anneal_time=5000, use_double_q=True, hypernet_layers=2, mixer_hidden_dim=32, hypernet_hidden_dim=64, target_action_noise_std=0.2, data_path=None, env=Namespace(args={}), model_path=None, use_share_model=True, reward_class=Namespace(id=None, args={}), vec_info_class=Namespace(id='EPS_RewardInfo', args={}), eval_metrics=[], disable_update_enemy=False, least_win_rate=0.5, recent_list_max_len=100, latest_weight=0.5, newest_pos=1, newest_weight=0.5)})
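The traceback above shows the environment constructor rejecting a 'cfg' keyword that was forwarded along with the regular environment kwargs. As a rough illustration of that failure mode and one possible way to guard against it, here is a minimal sketch. It is not the change made in #249 and does not use the OpenRL or gymnasium APIs: `FakeAtariEnv` and `filter_kwargs` are hypothetical names introduced only for this example, and the approach (dropping keywords the constructor's signature does not accept) is an assumption about what a fix could look like.

```python
import inspect
from argparse import Namespace


class FakeAtariEnv:
    """Hypothetical stand-in for an env whose constructor does not accept `cfg`."""

    def __init__(self, game="pong", frameskip=4, render_mode=None):
        self.game = game
        self.frameskip = frameskip
        self.render_mode = render_mode


def filter_kwargs(creator, kwargs):
    """Hypothetical helper: keep only the keyword arguments `creator` accepts."""
    params = inspect.signature(creator).parameters
    # If the creator declares **kwargs, everything can be forwarded unchanged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


# Kwargs shaped like the ones in the traceback: env options plus a stray `cfg`.
env_kwargs = {
    "game": "pong",
    "frameskip": 4,
    "render_mode": None,
    "cfg": Namespace(seed=0),
}

# Forwarding everything reproduces the same kind of TypeError as in the report.
error_message = None
try:
    FakeAtariEnv(**env_kwargs)
except TypeError as err:
    error_message = str(err)

# Filtering first lets construction succeed; `cfg` is simply not forwarded.
env = FakeAtariEnv(**filter_kwargs(FakeAtariEnv, env_kwargs))
```

Silently dropping unknown keywords can hide typos in real options, so a production fix might instead remove 'cfg' explicitly before calling make, or pass it only to environments known to accept it.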

huangshiyu13 commented 10 months ago

Fixed in #249.