facebookresearch / torchbeast

A PyTorch Platform for Distributed RL
Apache License 2.0

Minimum parameter configuration for not bad training results #32

Closed MXD6 closed 3 years ago

MXD6 commented 3 years ago

Hello author: I set the parameters as follows, but my training results are not good.

`--env PongNoFrameskip-v4 --mode train --num_actors 4 --total_steps 100000 --use_lstm`

Test results:

[INFO:17945 monobeast:600 2021-08-29 22:14:16,250] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:17,356] Episode ended after 759 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:18,460] Episode ended after 757 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:19,565] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:20,668] Episode ended after 757 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:21,769] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:22,869] Episode ended after 756 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:23,978] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:25,076] Episode ended after 755 steps. Return: -21.0
[INFO:17945 monobeast:600 2021-08-29 22:14:26,183] Episode ended after 761 steps. Return: -21.0
[INFO:17945 monobeast:604 2021-08-29 22:14:26,183] Average returns over 10 steps: -21.0

Please, I want to know the minimum parameter configuration needed to get a decent training result. Thank you!

heiner commented 3 years ago

Hey MaXiaodong,

I see you have closed this issue again. Did you manage to resolve the underlying problem?

To be completely honest, it's been a while since I last ran monobeast.py, but I believe it is capable of learning Pong (the environment it runs on by default) with its default set of hyperparameters.
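For context, a return of -21.0 on Pong means the agent lost every single point, i.e. it has learned nothing yet; 100,000 environment steps is typically far too small a budget for Pong with IMPALA-style training. A fuller invocation might look like the sketch below. The flag names follow the command quoted above; the specific values are illustrative assumptions, not the repository's documented defaults, so they should be checked against monobeast.py's argument parser before use.

```shell
# Hypothetical monobeast training run on Pong (values are illustrative).
# More actors and a much larger step budget than in the original command;
# expect many millions of steps before the return climbs above -21.
python -m torchbeast.monobeast \
    --env PongNoFrameskip-v4 \
    --mode train \
    --num_actors 45 \
    --total_steps 30000000 \
    --batch_size 8 \
    --unroll_length 80 \
    --use_lstm
```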