takuseno / d3rlpy

An offline deep reinforcement learning library
https://takuseno.github.io/d3rlpy
MIT License

Examples not working #295

Closed AxelWimmer closed 1 year ago

AxelWimmer commented 1 year ago

I installed d3rlpy via pip on Windows and tried to execute the basic example from the tutorial. For example:

import torch
print(torch.cuda.is_available())
from d3rlpy.datasets import get_cartpole # CartPole-v0 dataset
from d3rlpy.datasets import get_pendulum # Pendulum-v0 dataset
#from d3rlpy.datasets import get_pybullet # PyBullet task datasets
from d3rlpy.datasets import get_atari    # Atari 2600 task datasets
from d3rlpy.datasets import get_d4rl     # D4RL datasets

dataset, env = get_cartpole()

from sklearn.model_selection import train_test_split

train_episodes, test_episodes = train_test_split(dataset, test_size=0.2)

from d3rlpy.algos import DQN

# if you don't use GPU, set use_gpu=False instead.
dqn = DQN(use_gpu=True)

# initialize neural networks with the given observation shape and action size.
# this is not necessary when you directly call fit or fit_online method.
dqn.build_with_dataset(dataset)

from d3rlpy.metrics.scorer import td_error_scorer
from d3rlpy.metrics.scorer import average_value_estimation_scorer

# calculate metrics with test dataset
td_error = td_error_scorer(dqn, test_episodes)

from d3rlpy.metrics.scorer import evaluate_on_environment

# set environment in scorer function
evaluate_scorer = evaluate_on_environment(env)

# evaluate algorithm on the environment
rewards = evaluate_scorer(dqn)

dqn.fit(train_episodes,
        eval_episodes=test_episodes,
        n_epochs=10,
        scorers={
            'td_error': td_error_scorer,
            'value_scale': average_value_estimation_scorer,
            'environment': evaluate_scorer
        })

However, I always get the same error:

Traceback (most recent call last):
  File "./online_rl.py", line 38, in <module>
    dqn.fit(train_episodes,
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\base.py", line 406, in fit
    results = list(
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\base.py", line 665, in fitter
    self._evaluate(eval_episodes, scorers, logger)
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\base.py", line 802, in _evaluate
    test_score = scorer(self, episodes)
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\metrics\scorer.py", line 472, in scorer
    action = algo.predict([observation])[0]
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\algos\base.py", line 127, in predict
    return self._impl.predict_best_action(x)
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\torch_utility.py", line 305, in wrapper
    return f(self, *args, **kwargs)
  File "D:\MasterHiWi.hiwi_env\lib\site-packages\d3rlpy\torch_utility.py", line 246, in wrapper
    tensor = tensor.to(self.device)
AttributeError: 'list' object has no attribute 'to'

The problem is caused by the different scorer metrics, but I don't understand why. It is supposed to work out of the box.
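One way to narrow down which scorer is responsible is to call each one on its own and record whether it raises. The following is only a minimal sketch: `run_scorers_individually` is a hypothetical helper, not part of d3rlpy, and the two stand-in scorers merely mimic the `scorer(algo, episodes)` call shape visible in the traceback.

```python
def run_scorers_individually(scorers, algo, episodes):
    """Call each scorer on its own and record its result or its exception."""
    report = {}
    for name, scorer in scorers.items():
        try:
            report[name] = ("ok", scorer(algo, episodes))
        except Exception as exc:  # keep going so every scorer gets tried
            report[name] = ("failed", f"{type(exc).__name__}: {exc}")
    return report


# Hypothetical stand-ins; real scorers would be passed in the same way.
def passing_scorer(algo, episodes):
    return 0.5


def failing_scorer(algo, episodes):
    raise AttributeError("'list' object has no attribute 'to'")


report = run_scorers_individually(
    {"td_error": passing_scorer, "environment": failing_scorer},
    algo=None,
    episodes=[],
)
for name, (status, detail) in report.items():
    print(name, status, detail)
```

With the real objects, the same dictionary passed to `fit` as `scorers` could be passed here together with `dqn` and `test_episodes`. In the traceback above, the failing frame is in `scorer.py` at `algo.predict([observation])`.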

HateBunnyPlzzz commented 1 year ago

The current examples in the documentation break because of an incompatibility with newer gym versions. Try downgrading gym with pip install gym==0.17.2, and let me know if it works.
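For context, one well-known difference between gym releases is the return shape of `reset()` and `step()`: older releases return a bare observation from `reset()` and a 4-tuple from `step()`, while gym 0.26 and later return `(observation, info)` and a 5-tuple with separate `terminated` and `truncated` flags. Whether this is the exact cause of the traceback above is an assumption. The sketch below uses two hypothetical helpers, `unpack_reset` and `unpack_step`, to show how code written against one shape can be made tolerant of the other. It uses plain tuples instead of a real environment so it runs without gym installed.

```python
def unpack_reset(result):
    """Return only the observation from either reset() return shape."""
    # Newer API: (observation, info), where info is a dict.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    # Older API: the observation itself.
    return result


def unpack_step(result):
    """Return (observation, reward, done) from a 4- or 5-element step() result."""
    if len(result) == 5:
        observation, reward, terminated, truncated, _info = result
        return observation, reward, terminated or truncated
    observation, reward, done, _info = result
    return observation, reward, done


# Plain-Python stand-ins for what an environment would return.
obs = [0.01, -0.02, 0.03, 0.04]
old_reset, new_reset = obs, (obs, {})
old_step = (obs, 1.0, False, {})
new_step = (obs, 1.0, False, True, {})

print(unpack_reset(old_reset) == unpack_reset(new_reset))
print(unpack_step(old_step))
print(unpack_step(new_step))
```

The `reset()` check is a heuristic: an environment whose observation is itself a 2-tuple ending in a dict would be misread, so this is a sketch rather than a general-purpose shim.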

takuseno commented 1 year ago

The major update v2.0.2 has been released. In this version, d3rlpy primarily supports gym==0.26.0, so this should no longer be a problem. Feel free to reopen this issue if there is anything further to discuss.
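Before deciding whether to downgrade gym or upgrade d3rlpy, it can help to check which versions are installed. The sketch below uses the standard-library `importlib.metadata` module (Python 3.8+). `installed_version` and `version_tuple` are hypothetical helper names, and the simple numeric parsing is an assumption that ignores pre-release suffixes.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.26.2' into (0, 26, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


for package in ("d3rlpy", "gym"):
    print(package, installed_version(package))
```

Comparing tuples such as `version_tuple("0.17.2") < version_tuple("0.26.0")` then gives a quick way to tell which side of a version boundary an environment is on.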