Open nullonesix opened 3 months ago
im just wondering how to resolve the issue in a way that's not hacky since it clearly reveals some misunderstanding i have about the process of interfacing ezv2 with a custom env
Thank you for the question. I tested CartPole-v1 with gym 0.22.0 and hit the same problem. But I see the following figure on the OpenAI Gym GitHub page, so I guess you can upgrade gym if you want to test the gym benchmark.
then i get:
(ezv2) swarms@dpm10:~/EfficientZeroV2$ python ez/train.py exp_config=ez/config/exp/cartpole.yaml
Traceback (most recent call last):
  File "ez/train.py", line 25, in <module>
    from ez import agents
  File "/home/swarms/EfficientZeroV2/ez/agents/__init__.py", line 6, in <module>
    from ez.agents.ez_atari import EZAtariAgent
  File "/home/swarms/EfficientZeroV2/ez/agents/ez_atari.py", line 13, in <module>
    from ez.envs import make_atari
  File "/home/swarms/EfficientZeroV2/ez/envs/__init__.py", line 3, in <module>
    from gym.wrappers import Monitor
ImportError: cannot import name 'Monitor' from 'gym.wrappers' (/home/swarms/miniconda3/envs/ezv2/lib/python3.8/site-packages/gym/wrappers/__init__.py)
thank you for being so helpful
Oh, I see. The Monitor class has been removed from the newest gym. If you don't need to record videos, you can simply remove the Monitor wrapper for now.
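A minimal sketch of that change, for readers following along. It assumes `Monitor` is used only for video recording in `ez/envs/__init__.py`, which the thread does not confirm. `wrap_for_video` and `video_dir` are hypothetical names, not part of EfficientZeroV2. `RecordVideo` is the recorder that newer gym releases ship in `gym.wrappers`.

```python
def wrap_for_video(env, video_dir, record=False):
    """Wrap `env` with whichever gym video recorder is installed.

    Hypothetical helper, not part of EfficientZeroV2.
    """
    if not record:
        return env  # no recording requested: use the env unwrapped
    try:
        from gym.wrappers import Monitor as Recorder  # older gym releases
    except ImportError:
        try:
            from gym.wrappers import RecordVideo as Recorder  # newer gym releases
        except ImportError:
            return env  # no recorder available: fall back to the bare env
    # Both wrappers take the output directory as the second positional argument.
    return Recorder(env, video_dir)
```

With `record=False` the env is returned untouched, so training does not depend on either wrapper being importable.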
ok, so after upgrading gym to latest version, replacing monitor with recordvideo, and disabling seeding, i run training and i get:
(pid=4076730) (array([-0.00774786, 0.02113981, 0.00405227, 0.04962296], dtype=float32), {})
(pid=4076730) [-0.01504798 -0.04475251 0.02837704 0.02546087]
(pid=4076730) (array([-0.01801576, -0.04822684, 0.02919559, -0.00987596], dtype=float32), {})
(pid=4076699) (array([-0.01454708, 0.03767176, -0.04898228, 0.00636753], dtype=float32), {})
(pid=4076699) [ 0.03428657 0.01023199 0.01853956 -0.01512355]
(pid=4076699) (array([-0.00603121, -0.0295258 , -0.01322525, 0.01265372], dtype=float32), {})
where the print statement in question is:
from ..base import BaseWrapper

class GymWrapper(BaseWrapper):
    """
    Make your own wrapper: Atari Wrapper
    """
    def __init__(self, env, obs_to_string=False):
        super().__init__(env, obs_to_string, False)

    def step(self, action):
        obs, reward, _, done, info = self.env.step(action)
        info['raw_reward'] = reward
        return obs, reward, done, info

    def reset(self,):
        print(self.env.reset())
        obs, info = self.env.reset()
        return obs
so, sometimes it is a pair, and sometimes it is not ?
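For reference, a sketch of how both return shapes could be normalized. In gym 0.26 and later, `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`, while older releases return a bare `obs` and a 4-tuple. The wrapper above calls `self.env.reset()` twice, so the printed value and the returned one come from different resets. If the wrapped env is itself a wrapper whose `reset()` returns only `obs`, the outer print would show a bare array. That is one possible reading of the output, not something confirmed in this thread. The helper names `unpack_reset` and `unpack_step` are hypothetical.

```python
def unpack_reset(result):
    """Return (obs, info) for either reset() convention.

    Assumes an observation is never itself a (value, dict) pair.
    """
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0], result[1]  # newer API: (obs, info)
    return result, {}  # older API: bare observation


def unpack_step(result):
    """Return (obs, reward, done, info) for either step() convention."""
    if len(result) == 5:
        # newer API: the episode ends if it terminated OR was truncated
        obs, reward, terminated, truncated, info = result
        done = bool(terminated or truncated)
    else:
        # older API: a single done flag
        obs, reward, done, info = result
        done = bool(done)
    return obs, reward, done, info
```

Under the newer API, the `step` shown above discards the third value (`terminated`) and uses `truncated` as `done`; combining the two flags as in `unpack_step` avoids missing a terminated episode.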
That is very weird. Do you see the same problem when you run the gym example from the figure I sent?
It is impressive that I manage to get such weird behavior with so few changes 🤣 . Do you mean this image https://github.com/Shengjiewang-Jason/EfficientZeroV2/issues/5#issuecomment-2292634909 ? I didn't try it as I was unsure exactly where it went. I finally read the ezv2 paper and now am going through your code base so hopefully my understanding will improve.
Yeah, right. You can try it to test whether the basic env works. Ok, bro. You can also look through the codebase. The problem may be in one of the env wrappers in the envs folder, so pay extra attention to the wrappers. If you still run into problems, you can send them to me.
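A basic check along those lines might look like the sketch below. The figure referred to in the thread is not reproduced here, so this is not a copy of it. `describe_env_api` is a hypothetical helper that only reports which return shapes an env produces, assuming the env exposes `reset()`, `step()`, and `action_space.sample()` as gym envs do.

```python
def describe_env_api(env):
    """Report the shapes returned by one reset() and one step() of `env`."""
    first = env.reset()
    reset_is_pair = isinstance(first, tuple) and len(first) == 2
    action = env.action_space.sample()  # any valid action will do
    step_len = len(env.step(action))
    return {"reset_returns_pair": reset_is_pair, "step_tuple_length": step_len}

# Example (requires gym to be installed):
#   import gym
#   print(describe_env_api(gym.make("CartPole-v1")))
```

Running it on the bare env first and then on each wrapper in turn would show at which layer the return shape changes.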
trying to hook it up to cartpole..
but it expects:
ie there's no info component
the full code where the error is thrown:
the error itself is: