widebowl closed this issue 3 years ago
Hmm, this is strange, and I'm not really sure what is happening here. It seems the worker that is supposed to play the game isn't actually doing so. Can you check whether Ray has multiple workers actually playing the game? There should be at least two.
Hi. I tried running your code on Ubuntu Linux after setting up a virtual machine. Please take a look at the output below and let me know why this happens. This is the output after running for about 5 hours.
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30914.91. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30914.91. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31942.12. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31942.12. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30468.08. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30468.08. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -29278.16. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -29350.38. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -29350.38. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31926.97. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -31926.97. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30892.09. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30892.09. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -29666.60. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -29666.60. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30646.84. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30646.84. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -24786.48. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -24786.48. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31290.79. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -31290.79. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30271.14. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30271.14. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -32123.04. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -32123.04. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31048.90. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -31048.90. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30950.96. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30950.96. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30795.34. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30795.34. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -32585.52. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -30935.74. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -30935.74. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -32087.31. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -32087.31. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...p: 0/50000000. Played games: 1. Loss: 0.00
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -28482.38. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
Saving modelward: -28482.38. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00
Saving modelward: -31326.23. Training step: 0/50000000. Played games: 1. Loss: 0.00
Persisting replay buffer games to disk...
^Zst test reward: -31326.23. Training step: 0/50000000. Played games: 1. Loss: 0.00
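For anyone trying to answer the "are multiple workers playing?" question from output like this, here is a rough sketch that summarizes a pasted log. It is not part of the project: `summarize_log` and the regexes are hypothetical helpers written against the text as it appears in this thread. It relies on Ray prefixing output forwarded from worker processes with `(pid=N)`, and it matches only the surviving `ward: <number>` tail of the reward label, since the start of that label is overwritten in the paste.

```python
import re

# Status line as it appears in the pasted output; \s* tolerates line wraps.
STATUS_RE = re.compile(
    r"Training step:\s*(\d+)/(\d+)\.\s*Played games:\s*(\d+)\.\s*Loss:\s*(-?\d+(?:\.\d+)?)"
)
# Ray prefixes output forwarded from worker processes with "(pid=N)".
PID_RE = re.compile(r"\(pid=(\d+)\)")
# The reward label is partly overwritten ("Saving modelward: ..."),
# so match only the "ward: <number>" tail that survives.
REWARD_RE = re.compile(r"ward:\s*(-?\d+(?:\.\d+)?)")


def summarize_log(text):
    """Return progress counters and the worker pids seen in a pasted log."""
    statuses = STATUS_RE.findall(text)
    steps = [int(step) for step, _total, _games, _loss in statuses]
    games = [int(played) for _step, _total, played, _loss in statuses]
    return {
        "status_lines": len(statuses),
        "max_training_step": max(steps, default=None),
        "max_played_games": max(games, default=None),
        "finished_game_messages": text.count("Finished a game!"),
        "worker_pids": sorted({int(pid) for pid in PID_RE.findall(text)}),
        "distinct_rewards": len({float(r) for r in REWARD_RE.findall(text)}),
    }


# A short excerpt copied from the output in this thread.
SAMPLE = (
    "(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00\n"
    "Saving modelward: -30914.91. Training step: 0/50000000. Played games: 1. Loss: 0.00\n"
    "Persisting replay buffer games to disk...\n"
    "(pid=2276) Finished a game!. Training step: 0/50000000. Played games: 1. Loss: 0.00\n"
    "Saving modelward: -31942.12. Training step: 0/50000000. Played games: 1. Loss: 0.00\n"
)

if __name__ == "__main__":
    for key, value in summarize_log(SAMPLE).items():
        print(f"{key}: {value}")
```

On this output, every "Finished a game!" message comes from the single pid 2276, while "Played games" stays at 1 and the training step stays at 0. That is consistent with only one worker producing games (possibly the test worker, since the number that changes is a test reward), though the log alone cannot confirm which worker that pid belongs to.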