kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
MIT License

Error when loading fixed replay buffer #34

Open dgjung0220 opened 2 years ago

dgjung0220 commented 2 years ago

Hi, Thank you for the code contribution.

I ran the code with this command:

python atari/run_dt_atari.py --seed 123 --epochs 5 --model_type 'reward_conditioned' --num_steps 500000 --num_buffers 50 --game 'Breakout' --batch_size 64

But the output looked like this (always 0 loaded transitions): [image]

I think the problem is that the frb (fixed replay buffer) is not created properly in "fixed_replay_buffer.py". This part: [image]

When circular_replay_buffer from the dopamine library is instantiated and the buffer is loaded, the buffer is not created properly, so the fixed replay buffer always returns None.

Is it a bug or am I doing something wrong in this setting?

ttt496 commented 2 years ago

"self._data_dir" in "fixed_replay_buffer.py" seems to be different from what the authors assumed, because of the processing in "create_dataset.py" line 37. I think this problem can be resolved by changing "create_dataset.py" line 37 from data_dir=data_dir_prefix + game + '/1/replay_logs', to data_dir=data_dir_prefix + f'{game}/1/replay_logs',

T0M0F commented 2 years ago

I had the same problem. I forgot to append a slash to the data_dir_prefix argument, e.g. --data_dir_prefix ./data/. If you omit the slash the script tries to load ./dataBreakout/1/replay_logs instead of ./data/Breakout/1/replay_logs.
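
The effect of the missing slash can be reproduced in isolation. This is a minimal sketch, assuming the path is built by the concatenation quoted above; concat_path and ensure_trailing_slash are hypothetical names used only for illustration and are not part of the repository.

```python
def concat_path(data_dir_prefix, game):
    # Plain string concatenation, as in the line quoted from create_dataset.py.
    return data_dir_prefix + game + '/1/replay_logs'


def ensure_trailing_slash(prefix):
    # Hypothetical guard: append a slash only when the prefix lacks one.
    return prefix if prefix.endswith('/') else prefix + '/'


print(concat_path('./data', 'Breakout'))    # ./dataBreakout/1/replay_logs
print(concat_path('./data/', 'Breakout'))   # ./data/Breakout/1/replay_logs
print(concat_path(ensure_trailing_slash('./data'), 'Breakout'))
```

With the guard applied, both spellings of the prefix lead to the same directory.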

lxqpku commented 2 years ago

"self._data_dir" in "fixed_replay_buffer.py" seems to be different from what the authors assumed, because of the processing in "create_dataset.py" line 37. I think this problem can be resolved by changing "create_dataset.py" line 37 from data_dir=data_dir_prefix + game + '/1/replay_logs', to data_dir=data_dir_prefix + f'{game}/1/replay_logs',

This does not work for me either. I also pass the argument --data_dir_prefix ./dqn_replay/ and it still goes wrong. Has anyone solved this issue?

haoruili699 commented 1 year ago

I used the absolute path in --data_dir_prefix and this error no longer occurs.
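
One possible reason an absolute path helps is that a relative prefix is resolved against the current working directory, which depends on where the script is launched from. The sketch below is an assumption-laden illustration, not code from the repository: resolve_replay_dir is a hypothetical helper that turns the prefix into an absolute path and fails early if the directory is missing, instead of silently loading 0 transitions.

```python
import os


def resolve_replay_dir(data_dir_prefix, game):
    """Return the absolute replay_logs path, or raise if it does not exist."""
    # Hypothetical helper: the '<game>/1/replay_logs' layout is taken from
    # the paths quoted in this thread.
    path = os.path.abspath(
        os.path.join(data_dir_prefix, game, '1', 'replay_logs'))
    if not os.path.isdir(path):
        raise FileNotFoundError(f'replay_logs directory not found: {path}')
    return path


if __name__ == '__main__':
    try:
        print(resolve_replay_dir('./data', 'Breakout'))
    except FileNotFoundError as err:
        # The message shows the fully resolved path that was searched.
        print(err)
```

Printing the resolved path makes it easy to see which directory the script actually tried to open.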

Hust1Booze commented 1 year ago

Pass Breakout, not 'Breakout', and D:/Code/dataset/, not 'D:/Code/dataset/ '. Don't wrap the arguments in quotes.
This worked for me. If you debug the code, you can see the problem.
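
To illustrate why the quotes matter: some shells (cmd.exe on Windows, for example) pass single quotes through unchanged, so they become part of the argument value. The sketch below uses a stand-in argparse parser with only the two arguments discussed here; it is not the script's actual parser, and strip_quotes is a hypothetical guard.

```python
import argparse

# Stand-in parser with the two arguments discussed in this thread.
parser = argparse.ArgumentParser()
parser.add_argument('--game', type=str)
parser.add_argument('--data_dir_prefix', type=str)

# What the script receives when the shell does not strip single quotes.
args = parser.parse_args(
    ['--game', "'Breakout'", '--data_dir_prefix', "'D:/Code/dataset/'"])

# The quotes end up inside the path, which then points at nothing.
print(args.data_dir_prefix + args.game + '/1/replay_logs')


def strip_quotes(value):
    # Hypothetical guard: drop one pair of matching surrounding quotes.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in '\'"':
        return value[1:-1]
    return value


print(strip_quotes(args.data_dir_prefix) + strip_quotes(args.game)
      + '/1/replay_logs')
```

The first print shows the quote characters embedded in the path; the second shows the path after the guard removes them.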

AsadMir10 commented 1 year ago

The above error was solved, but I got these warnings:

this buffer has 13576 loaded transitions and there are now 168767 transitions total divided into 120 trajectories
loading from buffer 47 which has 0 already loaded
WARNING:absl:Unable to find episode_end_indices. This is expected for old checkpoints.

theaimer09 commented 1 year ago

The above error was solved, but I got these warnings:

this buffer has 13576 loaded transitions and there are now 168767 transitions total divided into 120 trajectories
loading from buffer 47 which has 0 already loaded
WARNING:absl:Unable to find episode_end_indices. This is expected for old checkpoints.

Hi. I also got these warnings. Have you solved it already?

AsadMir10 commented 1 year ago

I don't think this is an issue; it's just because of old checkpoints. I was able to run epochs on a GPU and got max rtg and timestamps, so I think you are good to go.

theaimer09 commented 1 year ago

I don't think this is an issue; it's just because of old checkpoints. I was able to run epochs on a GPU and got max rtg and timestamps, so I think you are good to go.

Thanks for your reply, you are right.

pwyq commented 1 year ago

The error comes from the way the path is built.

In create_dataset.py, change

data_dir= data_dir_prefix + game + '/1/replay_logs',

to

import os
...
data_dir = os.path.join(data_dir_prefix, game, '1/replay_logs'),

lukerabbitte commented 11 months ago

Has anyone tried a workaround that doesn't use the circular_replay_buffer import from dopamine.replay_memory?

Reason: The cloning of the Dopamine library at the end of the conda_env.yml file as of December 2023 is causing a lot of dependency issues. It seems dopamine has a dependency on jaxlib which is not compatible with Python 3.7.9 as used in the paper (Error: No matching distribution found for jaxlib>=0.1.51).
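
One possible direction, offered only as an unverified sketch: read the checkpoint arrays directly with gzip and NumPy instead of importing dopamine. The file naming ('$store$_<field>_ckpt.<suffix>.gz') and the storage format (one gzip-compressed np.save array per file) are assumptions about how the replay logs are laid out on disk and should be checked against your own data; load_buffer_arrays and FIELDS are hypothetical names.

```python
import gzip
import os

import numpy as np

# Assumed checkpoint naming: '$store$_<field>_ckpt.<suffix>.gz', each file a
# gzip-compressed array written with np.save. Verify against your own data.
FIELDS = ('observation', 'action', 'reward', 'terminal')


def load_buffer_arrays(replay_logs_dir, suffix):
    """Read the raw arrays of one checkpoint without importing dopamine."""
    arrays = {}
    for field in FIELDS:
        filename = os.path.join(
            replay_logs_dir, f'$store$_{field}_ckpt.{suffix}.gz')
        with gzip.open(filename, 'rb') as f:
            arrays[field] = np.load(f, allow_pickle=False)
    return arrays
```

This returns only the raw arrays. Anything the replay buffer class does on top of them, such as stacking frames or sampling transitions, would have to be reimplemented separately.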

kjthedj commented 2 months ago

Has anyone found a workaround yet? Just re-pinging, as I too am struggling with dopamine-rl and jaxlib being incompatible with Python 3.7.9.