steveKapturowski / tensorflow-rl

Implementations of deep RL papers and random experimentation
Apache License 2.0

About actor_learner.py #15

Open lezhang-thu opened 7 years ago

lezhang-thu commented 7 years ago

Look at the following code:

        with self.monitored_environment(), session_context as self.session:
            self.synchronize_workers()

            if self.is_train:
                self.train()
            else:
                self.test()

After several attempts, it looks to me as if the "with ... as" block exits even though self.train() is still running. Here self.train() is PseudoCountQLearner's train() function. I tried catching tf.errors.OutOfRangeError, which TensorFlow does not re-raise, but that does not seem to be the cause. My guess is that TensorFlow leaves the "with ... as" block because no new training data is left in its queue. This might be because PseudoCountQLearner has to compute the mixed Monte Carlo (MC) return, so it waits for the end of the episode, while TensorFlow does not wait that long.

All in all, I could not find out why the "with ... as" block exits earlier than self.train() does. Thanks!
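
One way to narrow this down might be to log how the body of the "with" block actually ends. Within a single thread, a "with" block can only exit once its body returns or raises, so an exception that the outer context manager swallows would look like an early exit. Below is a minimal sketch using only the standard library. The names log_block_exit, SwallowingContext and QueueExhausted are hypothetical stand-ins and not part of tensorflow-rl or TensorFlow; SwallowingContext only imitates a session context that suppresses an exception, and I am assuming, not asserting, that the real session_context behaves this way.

```python
import contextlib


class QueueExhausted(Exception):
    """Hypothetical stand-in for an error the outer context suppresses."""


@contextlib.contextmanager
def log_block_exit(label, log=print):
    """Hypothetical helper: record how the wrapped body ended, then re-raise."""
    try:
        yield
    except BaseException as exc:
        log("{}: body raised {}: {}".format(label, type(exc).__name__, exc))
        raise
    else:
        log("{}: body returned normally".format(label))


class SwallowingContext(object):
    """Stand-in (assumption) for a session context that hides some errors."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Returning True suppresses the exception, so the caller sees a
        # silent exit from the "with" block.
        return exc_type is not None and issubclass(exc_type, QueueExhausted)


def train():
    # Placeholder for self.train(); fails the way a drained queue might.
    raise QueueExhausted("queue exhausted")


messages = []
with SwallowingContext():
    with log_block_exit("train", log=messages.append):
        train()

# Execution continues here without a traceback, but the log shows the cause.
reached_after_with = True
print(messages)
```

In the snippet above this would mean wrapping the self.train() call in log_block_exit inside the existing "with" block. If the log shows an exception, the block is not exiting early; train() is being interrupted and the error is hidden on the way out.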