DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
https://rl-baselines3-zoo.readthedocs.io
MIT License

[Bug]: enjoy panda policy in hugging face #408

Closed zhixiongzh closed 11 months ago

zhixiongzh commented 11 months ago

🐛 Bug

I tried to run the pre-trained panda_gym policy provided on Hugging Face. I think there is a version conflict, because I got the following error:

root@BF4-C-008T7:/workspaces/rl_theft# python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1  -f logs/
Loading latest experiment, id=1
Loading logs/tqc/PandaPush-v1_1/PandaPush-v1.zip
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/exp_manager.py", line 518, in entry_point
    return str(gym.envs.registry[env_id].entry_point)  # pytype: disable=module-attr
KeyError: 'PandaPush-v1'

To Reproduce

pip install rl_zoo3
python -m rl_zoo3.load_from_hub --algo tqc --env PandaPush-v1 -orga sb3 -f logs/
pip install panda-gym
python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1  -f logs/

Relevant log output / Error message

root@BF4-C-008T7:/workspaces/rl_theft# python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1  -f logs/
Loading latest experiment, id=1
Loading logs/tqc/PandaPush-v1_1/PandaPush-v1.zip
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/exp_manager.py", line 518, in entry_point
    return str(gym.envs.registry[env_id].entry_point)  # pytype: disable=module-attr
KeyError: 'PandaPush-v1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/enjoy.py", line 279, in <module>
    enjoy()
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/enjoy.py", line 139, in enjoy
    is_atari = ExperimentManager.is_atari(env_name.gym_id)
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/exp_manager.py", line 524, in is_atari
    return "AtariEnv" in ExperimentManager.entry_point(env_id)
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/exp_manager.py", line 520, in entry_point
    return str(gym26.envs.registry[env_id].entry_point)  # pytype: disable=module-attr
KeyError: 'PandaPush-v1'
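
The `KeyError` means that no environment with the id `PandaPush-v1` is present in the registry of the installed packages. A minimal sketch of that lookup, using a plain dict as a stand-in for the registry. The ids in `registry` (for example `PandaPush-v3`) and the helper `suggest_ids` are illustrative assumptions, not output from a real installation:

```python
import difflib

# Stand-in for the environment registry: env id -> entry point.
# The ids and entry-point strings here are hypothetical placeholders.
registry = {
    "PandaPush-v3": "placeholder",
    "PandaReach-v3": "placeholder",
    "CartPole-v1": "placeholder",
}


def suggest_ids(env_id, registered):
    """Return registered ids that closely resemble env_id."""
    return difflib.get_close_matches(env_id, list(registered), n=3, cutoff=0.8)


env_id = "PandaPush-v1"
try:
    entry_point = registry[env_id]
except KeyError:
    # Same failure mode as in the traceback: the id is simply not registered.
    print(f"{env_id!r} is not registered; similar ids: {suggest_ids(env_id, registry)}")
```

If the only similar ids carry a different version suffix, the installed environment package is probably a different major version from the one the policy was trained with.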

System Info

qgallouedec commented 11 months ago

Hi, you need to downgrade to panda-gym v1

zhixiongzh commented 11 months ago

@qgallouedec Thanks for the quick response. Could I get the correct install steps and the versions of all the other packages? When I downgrade with pip install panda-gym==1.0.0, I get a new error when I run enjoy:

AttributeError: module 'gym' has no attribute 'GoalEnv'

Then I also downgraded gym with pip install gym==0.21.0, but I still get a new error:

root@BF4-C-008T7:/workspaces/rl_theft# python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1  -f logs/
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/conda/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/__init__.py", line 6, in <module>
    import rl_zoo3.gym_patches  # noqa: F401
  File "/opt/conda/lib/python3.10/site-packages/rl_zoo3/gym_patches.py", line 82, in <module>
    patched_registry.update(gym.envs.registration.registry)
TypeError: 'EnvRegistry' object is not iterable

qgallouedec commented 11 months ago

Sure, make sure you use Python 3.9 or below and run

pip install stable-baselines3==1.5.1a8 panda_gym==1.1.1

zhixiongzh commented 11 months ago

Hi @qgallouedec, it still does not work. What about the versions of rl_zoo3 and gym?

Here are the steps to reproduce the error now (same as before):

pip install  rl_zoo3 stable-baselines3==1.5.1a8 panda_gym==1.1.1
python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1  -f logs/ 

and here are my current versions:

OS: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.17 #1 SMP Fri Apr 2 22:23:49 UTC 2021
Python: 3.8.13
Stable-Baselines3: 1.5.1a8
PyTorch: 2.1.0+cu121
GPU Enabled: True
Numpy: 1.22.3
Gym: 0.21.0
Panda Gym: 1.1.1

I got the same error as before, TypeError: 'EnvRegistry' object is not iterable:

Traceback (most recent call last):
  File "/home/myuser/.conda/envs/si/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/home/myuser/.conda/envs/si/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/myuser/.conda/envs/si/lib/python3.8/site-packages/rl_zoo3/__init__.py", line 6, in <module>
    import rl_zoo3.gym_patches  # noqa: F401
  File "/home/myuser/.conda/envs/si/lib/python3.8/site-packages/rl_zoo3/gym_patches.py", line 82, in <module>
    patched_registry.update(gym.envs.registration.registry)
TypeError: 'EnvRegistry' object is not iterable

I think the problem is the rl_zoo3 version, as I am already running it in a new conda environment.
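
The `TypeError` is consistent with a mismatch between rl_zoo3 and gym: the failing line calls `update()` on a dict, and `dict.update` needs a mapping or an iterable of pairs. A small sketch of that failure, assuming (this is not confirmed in the thread) that the old gym registry is an object that stores its specs in an attribute instead of behaving like a dict. The `EnvRegistry` class and `try_update` helper below are hypothetical stand-ins, not gym's code:

```python
class EnvRegistry:
    """Hypothetical stand-in: keeps specs in an attribute, is not iterable."""

    def __init__(self):
        self.env_specs = {"PandaPush-v1": "spec placeholder"}


def try_update(target, source):
    """Call target.update(source); return the error message, or None on success."""
    try:
        target.update(source)
    except TypeError as exc:
        return str(exc)
    return None


patched_registry = {}

# Passing the object itself fails, as in the traceback above.
error = try_update(patched_registry, EnvRegistry())
print(error)

# Passing an actual mapping works.
ok = try_update(patched_registry, EnvRegistry().env_specs)
```

Under that assumption, the fix is not to patch the call but to install an rl_zoo3 release that matches the gym version, which is what the pinned versions later in the thread do.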

zhixiongzh commented 11 months ago

BTW, I tried the lowest version of rl_zoo3 available on pip (1.6.2); it is also not compatible with stable-baselines3==1.5.1a8.

qgallouedec commented 11 months ago

pip install panda-gym==1.1.1 rl-zoo3==1.6.2 sb3-contrib==1.6.2 stable-baselines3==1.6.2
python -m rl_zoo3.enjoy --algo tqc --env PandaPush-v1

It gives

Episode Reward: -8.00
Episode Length 50
[...]
Episode Reward: -5.00
Episode Length 50
Success rate: 100.00%
20 Episodes
Mean reward: -6.10 +/- 1.87
Mean episode length: 50.00 +/- 0.00

zhixiongzh commented 11 months ago

Great! It works now with sb3-contrib==1.6.2, which I had not installed correctly.
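
Since the remaining problem was one package that had not been installed at the pinned version, a quick check of the environment can catch this before running enjoy. A minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The pins are the ones from the working command in this thread; `installed_version` and `check_pins` are hypothetical helper names, and whether a lookup by hyphenated name succeeds may depend on the Python version:

```python
from importlib import metadata

# Versions taken from the pip command that worked in this thread.
PINNED = {
    "panda-gym": "1.1.1",
    "rl-zoo3": "1.6.2",
    "sb3-contrib": "1.6.2",
    "stable-baselines3": "1.6.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in check_pins(PINNED).items():
    print(f"{package}: wanted {wanted}, found {found}")
```

An empty result means every pinned package is installed at the expected version; a `found` value of `None` means the package is missing.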