waffoo / accel

accelerate reinforcement learning
MIT License

Correct initial priority #9

Closed: waffoo closed this issue 4 years ago

waffoo commented 4 years ago

The initial priority of a new transition should be the max priority over all current td_errors, not the max over all past td_errors.

https://github.com/waffoo/accel/blob/bf7f975729dc04e4ee2b0766ad12fa839698f0f6/accel/agents/dqn.py#L45-L49

https://github.com/waffoo/accel/blob/bf7f975729dc04e4ee2b0766ad12fa839698f0f6/accel/replay_buffers/prioritized_replay_buffer.py#L24-L27

waffoo commented 4 years ago

On reflection, this is not a problem. However, the initial value of max_err is too small; the paper sets it to 1.0.
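
For illustration only, here is a minimal sketch of the behavior discussed above: new transitions receive the running max_err as their initial priority, and max_err starts at 1.0. The class name `TinyPrioritizedBuffer`, its methods, and the `eps` offset are hypothetical and are not taken from accel/replay_buffers/prioritized_replay_buffer.py. The proportional sampling here is a plain list-based stand-in, not a sum-tree.

```python
import random


class TinyPrioritizedBuffer:
    """Hypothetical minimal buffer for illustration; not the accel implementation."""

    def __init__(self, capacity, initial_max_err=1.0):
        self.capacity = capacity
        self.data = []
        self.priorities = []
        # Assumption: start at 1.0, the value this thread attributes to the paper.
        self.max_err = initial_max_err
        self.pos = 0  # next slot to overwrite once the buffer is full

    def push(self, transition):
        # A new transition has no TD error yet, so it gets the running maximum.
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(self.max_err)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = self.max_err
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sample indices with probability proportional to priority.
        idxs = random.choices(range(len(self.data)),
                              weights=self.priorities, k=batch_size)
        return idxs, [self.data[i] for i in idxs]

    def update(self, idxs, td_errors, eps=1e-6):
        # Refresh priorities after a learning step and track the running max.
        for i, err in zip(idxs, td_errors):
            p = abs(err) + eps
            self.priorities[i] = p
            self.max_err = max(self.max_err, p)
```

With this sketch, a transition pushed before any update gets priority 1.0, and a transition pushed after an update with a larger TD error inherits that larger value. If max_err started near zero instead, early transitions would all enter with almost no priority until the first update.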