Closed indweller closed 1 year ago
@indweller Thanks for reporting this. This happened because get_atari
relies on another repository, d4rl-atari, and I've just fixed the issue there. I've also updated d3rlpy to support the latest render interface in these commits: https://github.com/takuseno/d3rlpy/commit/a7207cd24455ae399c2f4217bb328940ab86b0e1 https://github.com/takuseno/d3rlpy/commit/2121edd853ea32a9fe6d262de639c69c13b50c24 . I'll release a patch that includes these fixes later today. Before you try this, please reinstall d4rl-atari:
pip install -U git+https://github.com/takuseno/d4rl-atari
The latest patch has been released. https://github.com/takuseno/d3rlpy/releases/tag/v2.0.4
Just as a reference, you can enable rendering like this:
from d3rlpy.datasets import get_atari
from d3rlpy.algos import DQNConfig
from d3rlpy.metrics import TDErrorEvaluator, EnvironmentEvaluator

# render_mode is passed through to the underlying Gym environment
dataset, env = get_atari(env_name='pong-expert-v4', render_mode="human")

dqn = DQNConfig().create(device='cuda:0')
dqn.build_with_dataset(dataset)

td_error_evaluator = TDErrorEvaluator(episodes=dataset.episodes)

# rolls out the policy in the environment; rendering happens during the rollout
env_evaluator = EnvironmentEvaluator(env)
rewards = env_evaluator(dqn, dataset=None)
Thanks for your quick response @takuseno ! The code snippet that you shared seems to be working fine. But when I use it as shown below, I get the following error.
import d3rlpy, d4rl_atari
import gym
import numpy as np

dqn = d3rlpy.load_learnable('./pongdqn.d3')
env = gym.make('pong-expert-v4', render_mode='human')

observations = env.reset()
observations = observations[0]
terminated = False
truncated = False
total = 0
positive_reward = 0
while not terminated and not truncated:
    action = dqn.predict(observations.reshape((1, 1, 84, 84)))[0]
    observations, reward, terminated, truncated, info = env.step(action)
    env.render()
    print(f"Reward: {reward}\n")
    total += reward
    if reward > 0:
        positive_reward += reward
print(f"Total Reward: {total}, Positive Reward: {positive_reward}")
env.close()
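As a side note on the shapes in the loop above: dqn.predict expects a batched observation, which is why the (84, 84) frame is reshaped to (1, 1, 84, 84), i.e. (batch, channels, height, width). A NumPy-only sketch of that reshape, using dummy data in place of the real frame (no environment or model needed):

```python
import numpy as np

# A dummy 84x84 grayscale Atari frame standing in for the env.reset() output.
obs = np.zeros((84, 84), dtype=np.uint8)

# Add batch and channel dimensions: (batch, channels, height, width).
batched = obs.reshape((1, 1, 84, 84))

assert batched.shape == (1, 1, 84, 84)
assert batched[0, 0].shape == obs.shape  # the frame itself is unchanged
```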
Error:
2023-07-23 18:50:34 [warning ] There might be incompatibility because of version mismatch. current_version=2.0.4 saved_version=2.0.3
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/envs/registration.py:623: UserWarning: WARN: The environment is being initialised with mode (human) that is not in the possible render_modes ([]).
logger.warn(
A.L.E: Arcade Learning Environment (version 0.8.1+53f58b7)
[Powered by Stella]
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:31: UserWarning: WARN: A Box observation space has an unconventional shape (neither an image, nor a 1D vector). We recommend flattening the observation to have only a 1D vector or use a custom policy to properly process the data. Actual observation shape: (84, 84)
logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
logger.warn(
/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`. (Deprecated NumPy 1.24)
if not isinstance(terminated, (bool, np.bool8)):
Traceback (most recent call last):
File "testing.py", line 25, in <module>
env.render()
File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/core.py", line 329, in render
return self.env.render(*args, **kwargs)
File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/order_enforcing.py", line 51, in render
return self.env.render(*args, **kwargs)
File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/wrappers/env_checker.py", line 53, in render
return env_render_passive_checker(self.env, *args, **kwargs)
File "/home/prashanth/projects/sandbox/lib/python3.8/site-packages/gym/utils/passive_env_checker.py", line 307, in env_render_passive_checker
assert (
AssertionError: With no render_modes, expects the Env.render_mode to be None, actual value: human
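For what it's worth, the assertion comes from gym's passive environment checker: the registered env apparently advertises an empty render_modes list in its metadata (see the first warning above), so the checker insists that render_mode be None. A simplified sketch of that check (not gym's actual source, just an illustration of why it fires):

```python
# Simplified stand-in for gym's passive_env_checker render check.
def check_render_mode(declared_modes, render_mode):
    # With no declared render_modes, Env.render_mode must be None.
    if not declared_modes:
        assert render_mode is None, (
            "With no render_modes, expects the Env.render_mode to be None, "
            f"actual value: {render_mode}"
        )

# The failing situation from the traceback: metadata advertises no modes,
# yet the env was created with render_mode='human'.
try:
    check_render_mode([], "human")
    raised = False
except AssertionError:
    raised = True

print(raised)  # True: the same condition that produces the error above
```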
I tried to render the environment for Atari Pong but kept running into the error above. Additionally, I also tried to train without rendering, load the trained model separately, and then render it during evaluation with
gym.make('pong-expert-v4', render_mode='human')
and env.render(), but the same error appears. I didn't face any issues while rendering CartPole.