tensorflow/agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Apache License 2.0

Learner.run got stuck #782

Closed: Rejuy closed this issue 1 year ago

Rejuy commented 1 year ago

Dear authors: thanks for building tf_agents! I ran into a problem while running the code of the Google Research project circuit-training. It creates a Learner object (call it learner), and learner.run in turn calls learner._train. I added some logging to help debug. Every log statement inside learner._train was printed, suggesting that _train finished. However, the log statement in learner.run immediately after the call to learner._train was never printed, which means the call never actually returned (the whole training process got stuck). How could this happen? I have no idea. Could you give me some advice? Thanks a lot!

  def run(self, iterations=1, iterator=None, parallel_iterations=10):
    """ ...
    """
    # do things...
    with self.train_summary_writer.as_default(), \
         common.soft_device_placement(), \
         tf.compat.v2.summary.record_if(_summary_record_if), \
         self.strategy.scope():
      iterator = iterator or self._experience_iterator
      loss_info = self._train(tf.constant(iterations),
                              iterator,
                              parallel_iterations)
      logging.info("return back to run")  # never printed out
      train_step_val = self.train_step.numpy()
      for trigger in self.triggers:
        trigger(train_step_val)

      return loss_info

  @common.function(autograph=True)
  def _train(self, iterations, iterator, parallel_iterations):
    # ...
    logging.info("_train start")  # printed out
    # do things
    logging.info("_train end")  # printed out
    return reduced_loss_info
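
One detail worth flagging here (an editorial note, not from the original thread): _train is wrapped in common.function, tf_agents' wrapper around tf.function. Python-level side effects such as logging.info inside a tf.function body run once, at trace time, not on every execution of the compiled graph. So seeing "_train start" and "_train end" in the logs does not prove the graph ran to completion; the function may have been traced and then blocked, for example while waiting on the experience iterator. A minimal sketch of the distinction, using plain tf.function:

    import logging
    import tensorflow as tf

    logging.basicConfig(level=logging.INFO)

    @tf.function
    def step(x):
      # Python side effects like logging.info run once, at trace time.
      logging.info("step traced")
      # tf.print is a graph op and runs on every execution.
      tf.print("step executed")
      return x * 2

    step(tf.constant(1))  # logs "step traced" and prints "step executed"
    step(tf.constant(2))  # same input signature: only "step executed" appears
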
For reference, the citation for the circuit training project:

@misc{CircuitTraining2021,
  title        = {{Circuit Training}: An open-source framework for generating chip floor plans with distributed deep reinforcement learning},
  author       = {Guadarrama, Sergio and Yue, Summer and Boyd, Toby and Jiang, Joe Wenjie and Songhori, Ebrahim and Tam, Terence and Mirhoseini, Azalia},
  howpublished = {\url{https://github.com/google-research/circuit_training}},
  year         = {2021},
  note         = {[Online; accessed 21-December-2021]}
}
sguada commented 1 year ago

Can you make sure that the actors are generating the data that the learner needs?

For instance, can you get data by doing:

next(learner._experience_iterator)
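
A quick way to run that check without risking hanging your main process (a sketch; probe_iterator is a hypothetical helper, not part of tf_agents):

    import threading

    def probe_iterator(it, timeout_s=30):
      # Pull one batch on a daemon thread so a blocking next() cannot
      # hang the probing process itself.
      result = {}

      def pull():
        result["batch"] = next(it)

      t = threading.Thread(target=pull, daemon=True)
      t.start()
      t.join(timeout_s)
      if t.is_alive():
        print("next() blocked for over %ds; the actors are likely not "
              "filling the replay buffer fast enough." % timeout_s)
      else:
        print("Got a batch:", type(result["batch"]))

    probe_iterator(learner._experience_iterator)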
Rejuy commented 1 year ago

I solved this problem by changing some parameters in the program. Thanks a lot!

kikushah commented 8 months ago

@Rejuy - I am facing the same issue. Can you share how you solved it?

@sguada - next(learner._experience_iterator) does generate data. But the return reduced_loss_info at the end of the _train function never actually returns.
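
For anyone else debugging this, one way to see where _train actually stalls (a sketch, assuming common.function defers to tf.function so that tf.config.run_functions_eagerly applies) is to temporarily disable graph compilation; in eager mode the logging inside _train reflects real per-step progress rather than a one-time trace:

    import tensorflow as tf

    # Debugging only: eager mode is much slower, but it makes
    # Python-level logs and stack traces inside _train meaningful.
    tf.config.run_functions_eagerly(True)
    try:
      loss_info = learner.run(iterations=1)
    finally:
      tf.config.run_functions_eagerly(False)

If the run then blocks inside next() on the experience iterator, that points back to the actors not producing data, as suggested above.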