training only prints out: "Agent 0" + "Samples 0"

Hello,

thank you for sharing your work!

Running a pretrained policy works like a charm.

But when I run a training session, the actual training never seems to be reached, because the sample buffer does not fill up.

I let a session run for more than an hour, but all I got was a continuous stream of "Agent 0" and "Samples 0".

But with zero samples in the buffer, rl_agent.py -> _train() satisfies neither the condition "self._total_sample_count >= self.init_samples" nor "self.replay_buffer_initialized", and therefore never reaches train_step(). No logging happens either; the log file stays completely empty.
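
To make clear what I mean, here is how I understand the gating, as a simplified sketch and not the actual DeepMimic code. The class ToyAgent and the methods store_samples() and train() are names I made up for illustration; only the attributes init_samples, replay_buffer_initialized and _total_sample_count come from what I read in _train(), and the sample count is set by hand instead of coming from the replay buffer.

```python
class ToyAgent:
    """Hypothetical stand-in for the agent; only the gating logic is modelled."""

    def __init__(self, init_samples):
        self.init_samples = init_samples
        self.replay_buffer_initialized = False
        self._total_sample_count = 0
        self.train_steps = 0  # counts how often the training step would run

    def store_samples(self, n):
        # Assumption: in the real code this number is read from the replay buffer.
        self._total_sample_count += n

    def train(self):
        if self.replay_buffer_initialized:
            self.train_steps += 1  # stands in for train_step()
            return "train_step"
        # Buffer not initialized yet: only the status lines are printed.
        print("Agent 0")
        print("Samples " + str(self._total_sample_count))
        if self._total_sample_count >= self.init_samples:
            self.replay_buffer_initialized = True
        return "waiting"


agent = ToyAgent(init_samples=1000)
for _ in range(3):
    agent.train()  # no samples arrive, so this only prints the status lines
stuck_steps = agent.train_steps

agent.store_samples(1000)
agent.train()  # threshold reached, buffer gets marked as initialized
agent.train()  # only now the training step runs
```

With a sample count that stays at zero, the first branch is never taken, which matches what I see.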

Do you have any idea why I only get "Agent 0", no matter how many workers I use, and why the agent (or agents) does not gather any samples?

Thank you in advance, bannox

xbpeng / DeepMimic, issue #119