toshikwa / fqf-iqn-qrdqn.pytorch

PyTorch implementation of FQF, IQN and QR-DQN.
MIT License

No performance in all three algorithms #5

Closed terryzhao127 closed 4 years ago

terryzhao127 commented 4 years ago

I used the following command to run each of the three algorithms on Pong (replacing <algo> with fqf and so on), but the returns always stay around -20.

python train_<algo>.py --cuda --env_id PongNoFrameskip-v4 --seed 0 --config config/<algo>.yaml
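The substitution described above can be spelled out as a minimal sketch that only builds the three command lines and runs nothing. The names `iqn` and `qrdqn` are assumed from the repository name (only `fqf` is named in this thread), and `build_command` is a hypothetical helper, not part of the repository.

```python
# Assumption: script and config names follow the repository name
# (fqf, iqn, qrdqn); only "fqf" is confirmed in this thread.
ALGOS = ["fqf", "iqn", "qrdqn"]


def build_command(algo, env_id="PongNoFrameskip-v4", seed=0):
    """Return the training command for one algorithm as an argument list."""
    return [
        "python", f"train_{algo}.py",
        "--cuda",
        "--env_id", env_id,
        "--seed", str(seed),
        "--config", f"config/{algo}.yaml",
    ]


# Build (but do not execute) one command per algorithm.
commands = {algo: build_command(algo) for algo in ALGOS}
for algo, cmd in commands.items():
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run` to launch the runs one after another.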

Is there anything wrong with the current master branch (b4928f91d22c80eb7e42aa268da7f64de7491636)?

toshikwa commented 4 years ago

Hi, @guikarist

How long did you train the algorithms?

Today, I trained FQF for less than three hours on the master branch on PongNoFrameskip-v4. (Orange: naive FQF, Blue: FQF with {multi_step: 3}.)

Screenshot 2020-06-15 16 39 12

It seems that they are learning well.

Actually, my implementation focuses on reproducing the papers and doesn't use techniques like n-step returns and Double Q-learning by default. So, if you want the algorithms to train faster, I recommend changing the config as below. (I think multi_step is the most effective.)

multi_step: 3
double_q_learning: True
dueling_net: True
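To illustrate why multi_step can speed up learning on a sparse-reward game like Pong, here is a minimal sketch of a scalar n-step return. This is not the repository's code, which works on quantile values; the function name `n_step_return`, the discount of 0.99, and the reward and bootstrap numbers are all assumptions chosen for illustration.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of n rewards plus the discounted bootstrap value.

    rewards: the n rewards observed after the current state.
    bootstrap_value: value estimate of the state reached after n steps.
    """
    n = len(rewards)
    discounted = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return discounted + (gamma ** n) * bootstrap_value


# Hypothetical situation: a reward of +1 arrives two steps in the future.
# The 1-step target only sees the immediate reward (0 here) and relies on
# the bootstrap estimate; the 3-step target includes the real reward.
one_step = n_step_return([0.0], bootstrap_value=0.5)
three_step = n_step_return([0.0, 0.0, 1.0], bootstrap_value=0.5)
print(one_step, three_step)
```

With multi_step: 3, a reward reaches the target of a state three steps earlier in a single update instead of being propagated one step at a time.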

If you are interested in more efficient algorithms (rather than good final performance), I recommend checking out policy-based algorithms like PPO.

Please let me know if you still have problems. Anyway, thank you for asking :)

terryzhao127 commented 4 years ago

Thanks for your reply!

Evidently the problem was training time. I reran the FQF experiment last night, and after more than 7 hours I got this result:

image

The curve looks just like the first part of yours. However, it's a bit too slow. How did you run 5M steps in 3 hours? Did you modify any parameters?

My test environment has 40 CPU cores, 500 GB of memory, and 8 Titan V GPUs.
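For scale, the throughput implied by "5M steps in 3 hours" can be worked out with a few lines of arithmetic. This is a sketch with a hypothetical helper name; the 5M and 3-hour figures are the ones quoted in the question above, not measured values.

```python
def steps_per_second(total_steps, hours):
    """Average environment steps per second over a run."""
    return total_steps / (hours * 3600)


def hours_to_reach(total_steps, rate):
    """Hours needed to reach total_steps at a given steps-per-second rate."""
    return total_steps / rate / 3600


# Figures quoted in the thread: 5M steps in about 3 hours.
reference_rate = steps_per_second(5_000_000, 3)
print(f"{reference_rate:.0f} steps/s")
```

A run that is several times slower than this rate on comparable hardware is a hint that training may not be running on the GPU.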

toshikwa commented 4 years ago

Hi, @guikarist

Let me make sure that the GPU is enabled. Could you check it as below?

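import torch

# True if PyTorch can see a CUDA-capable GPU.
print(torch.cuda.is_available())

# Move a tensor to the GPU and print where it lives (e.g. cuda:0).
a = torch.zeros(4)
a = a.cuda()
print(a.device)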

If it isn't using the GPU, please check your CUDA setup. If you're using a CUDA version other than 10.2, you may need to reinstall PyTorch built for your version of CUDA. Please see the instructions for more details.

Could you report the result back to me?
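When checking for a version mismatch, it can help to print which CUDA version the installed PyTorch build targets. The following is a minimal sketch, assuming only standard PyTorch attributes; `gpu_report` is a hypothetical helper, and it returns early if PyTorch is not installed at all.

```python
import importlib.util


def gpu_report():
    """Collect basic facts about the local PyTorch/CUDA setup."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build targets; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


report = gpu_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is None, or `cuda_available` is False on a machine that has GPUs, reinstalling a PyTorch build that matches the local CUDA driver is the likely fix.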

toshikwa commented 4 years ago

BTW, you have really good resources... I'm kinda jealous lol

terryzhao127 commented 4 years ago

These resources are shared across our lab; they don't belong to me LOL.

Thanks for your advice. Now I get similar results!

image

THSWind commented 4 years ago

BTW, you have really good resources... I'm kinda jealous lol

Me too, what a lucky dog!