Hi, you may find this answer useful.
My computer cannot run with num_actors 32, so I had to use 16. I am also using Windows 10, so I made these changes in your train.py at line 731:
    ctx = mp.get_context("spawn")  # mp.get_context("fork")
    free_queue = ctx.Queue()  # ctx.SimpleQueue()
    full_queue = ctx.Queue()  # ctx.SimpleQueue()
That might be the reason for my errors.
Could you provide a simpler example such as CartPole or Pendulum? Thanks!
I have the same issue
@cubicgate Yes, you might be running into errors when queuing in the actors or dequeuing in the learner. Remember that even with num_actors=1 there are three things running: one actor, one learner, and the main thread. Exceptions outside the main thread don't stall the main thread here, so you may want to add print statements at appropriate places to check whether any of the actors or learners is hitting an exception. The stats won't be shown until one of your buffers is loaded with a trajectory and the learner consumes it, so it might be worth waiting for some time (depending on your execution platform) before you expect any output.
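As an alternative to scattering prints, here is a minimal sketch of one way to surface an exception from a child process. It assumes the actors are started with multiprocessing; `run_guarded`, `flaky_actor`, and `error_queue` are hypothetical names for illustration and are not part of train.py.

```python
import multiprocessing as mp
import queue
import traceback


def run_guarded(target, error_queue, *args):
    """Run target(*args); on failure, send the traceback text to the parent."""
    try:
        target(*args)
    except Exception:
        error_queue.put(traceback.format_exc())
        raise


def flaky_actor(actor_index):
    # Hypothetical stand-in for an actor loop that fails partway through.
    raise RuntimeError(f"actor {actor_index} failed")


def main():
    ctx = mp.get_context("spawn")
    error_queue = ctx.Queue()
    proc = ctx.Process(target=run_guarded, args=(flaky_actor, error_queue, 0))
    proc.start()
    try:
        # Read from the queue before joining the child process.
        print("child reported:\n" + error_queue.get(timeout=10))
    except queue.Empty:
        print("no error reported by the child within 10 seconds")
    proc.join()
    print("child exit code:", proc.exitcode)


if __name__ == "__main__":
    main()
```

The wrapper re-raises after reporting, so the child still exits with a non-zero exit code that the parent can check.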
ctx = mp.get_context("spawn") # ctx = mp.get_context("fork")
This resolved my problems. I'm on Ubuntu.
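For anyone trying the same change, here is a minimal, self-contained sketch of the spawn-based setup with two queues. It only mirrors the names `ctx`, `free_queue`, and `full_queue` quoted above; `fill_buffers`, the `None` sentinel, and `num_buffers=4` are assumptions for illustration, not the logic in train.py.

```python
import multiprocessing as mp
import queue


def fill_buffers(free_queue, full_queue):
    """Hypothetical actor: take free buffer indices and hand them back as full."""
    while True:
        index = free_queue.get()
        if index is None:  # sentinel: no more work
            break
        full_queue.put(index)


def main(num_buffers=4):
    # "spawn" starts a fresh interpreter for each child; it is the only
    # start method available on Windows.
    ctx = mp.get_context("spawn")
    free_queue = ctx.Queue()
    full_queue = ctx.Queue()
    actor = ctx.Process(target=fill_buffers, args=(free_queue, full_queue))
    actor.start()
    for index in range(num_buffers):
        free_queue.put(index)
    free_queue.put(None)
    filled = []
    try:
        for _ in range(num_buffers):
            filled.append(full_queue.get(timeout=10))
    except queue.Empty:
        print("actor produced nothing; it may have crashed on startup")
    actor.join()
    return filled


if __name__ == "__main__":
    print(main())
```

With "spawn", the `if __name__ == "__main__":` guard matters, because each child re-imports the main module on startup.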
I'm not sure you will be able to run Gym Atari or DMLab on Windows without issues.
I had the same problem with Python 3.6. After switching to Python 3.7, the issue disappeared. It seems that TorchBeast only works with Python 3.7.
Hello, I tried your command below, but with num_actors set to 16 instead of 32: python train.py --total_steps 10000000 --learning_rate 0.0004 --unroll_length 239 --num_buffers 40 --n_layer 3 --d_inner 1024 --xpid row82 --chunk_size 80 --action_repeat 1 --num_actors 32 --num_learner_threads 1 --sleep_length 5 --atari True
But I got: Steps 0 @ 0.0 SPS. Loss inf. Stats