keiohta / tf2rl

TensorFlow2 Reinforcement Learning
MIT License

Improve apex #122

Open ymd-h opened 3 years ago

ymd-h commented 3 years ago

This PR is for #117

This improvement has a larger effect for small networks and/or simple Envs.

I tested by running example/run_apex_dqn.py with the default "CartPole-v0" on a CPU machine. Please test other Envs and/or on a GPU.

P.S. Distributing weights through multiple queues seems to be inefficient because the weights are copied multiple times. I will keep looking for other solutions.
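
To illustrate the copying concern above, here is a minimal sketch. It is not the tf2rl code: it assumes one standard-library `multiprocessing.Queue` per explorer, and the helper names `broadcast_via_queues` and `serialized_bytes` are hypothetical. Each `put()` pickles its argument separately, so every queue ends up carrying its own copy of the same weights.

```python
import multiprocessing as mp
import pickle


def broadcast_via_queues(weights, n_explorers):
    """Send the same weights through one queue per (hypothetical) explorer."""
    queues = [mp.Queue() for _ in range(n_explorers)]
    for q in queues:
        # Every put() serializes `weights` again, independently of the others.
        q.put(weights)
    # Read everything back in this process to inspect what explorers would get.
    received = [q.get(timeout=10) for q in queues]
    for q in queues:
        q.close()
        q.join_thread()
    return received


def serialized_bytes(weights, n_explorers):
    """Approximate total bytes pickled when broadcasting to n_explorers queues."""
    return len(pickle.dumps(weights)) * n_explorers
```

Calling `broadcast_via_queues(weights, 4)` returns four separate objects that compare equal to `weights`, and `serialized_bytes(weights, 4)` grows linearly with the number of queues. That linear growth is the overhead described above.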

keiohta commented 3 years ago

Hi @ymd-h, thanks for the brilliant PR!! I really appreciate your continued support!

I checked the code changes, and I think all of them will contribute to improving ApeX performance.

> P.S. Distributing weights through multiple queues seems to be inefficient because the weights are copied multiple times. I will keep looking for other solutions.

Yeah, this is true. I will also think about possible workarounds. Thanks!

ymd-h commented 2 years ago

@keiohta

I updated the PR.

On my local machine, (part of) the logs from example/run_apex_dqn.py are as follows:

Before PR

[Screenshot: SS 2022-02-13 11 33 24]

After PR

[Screenshot: SS 2022-02-13 11 35 09]

It seems that Super Linter v3 is broken for some reason, so we need to upgrade to v4. (I will add this soon.) https://github.com/github/super-linter/issues/2253