PacktPublishing / Tensorflow-2-Reinforcement-Learning-Cookbook

Tensorflow 2 Reinforcement Learning Cookbook, published by Packt
https://praveenp.com/deeprl/tensorflow-2.x-reinforcement-learning-cookbook/
MIT License

Where is the hidden state in the DRQN code? #54

Open evbtst opened 1 year ago

evbtst commented 1 year ago

Hello!

First of all, I want to commend your code, it's excellent! Thank you very much for your work!

However, I have a question regarding the file Chapter03/4_drqn.py. Shouldn't it be possible to access the hidden states of the LSTM? Also, shouldn't it be reset at the beginning of each epoch? I looked for this in the book as well, and it wasn't clear to me.

praveen-palanisamy commented 1 year ago

Hi @evbtst, Thank you for your kind words!

  1. Shouldn't it be possible to access the hidden states of the LSTM?

Yes, it's possible to access the hidden state of the LSTM that the Agent's model uses.

Based on this definition:

https://github.com/PacktPublishing/Tensorflow-2-Reinforcement-Learning-Cookbook/blob/91df46508e5672155d9f82f684d0f5d68680ecdf/Chapter03/4_drqn.py#L68-L76

that the Agent uses for its model and target_model, you can access the hidden states using the following:

agent_hidden_states = agent.model.model.layers[1].states

You can inspect the state in your IDE or print it to console using:

for state in agent_hidden_states:
    print(state.numpy())
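If you want to try this outside the repo, here is a minimal, self-contained sketch. The model below is a hypothetical stand-in for the DRQN network (an Input -> LSTM -> Dense stack with made-up sizes, not the book's exact ones), built with stateful=True so the hidden and cell states are materialized as variables you can read:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the DRQN model: Input -> LSTM -> Dense.
# Sizes are illustrative; stateful=True materializes h and c as variables.
batch_size, seq_len, state_dim = 1, 4, 8
inputs = tf.keras.Input(batch_shape=(batch_size, seq_len, state_dim))
x = tf.keras.layers.LSTM(16, stateful=True)(inputs)
model = tf.keras.Model(inputs, tf.keras.layers.Dense(2)(x))

# Run a forward pass so the LSTM computes and stores its states.
model(np.random.rand(batch_size, seq_len, state_dim).astype(np.float32))

# layers[1] is the LSTM (layers[0] is the InputLayer), matching the
# agent.model.model.layers[1] indexing used above.
hidden_states = model.layers[1].states
for state in hidden_states:  # [hidden state h, cell state c]
    print(state.numpy().shape)  # each is (batch_size, units)
```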
  2. Also, shouldn't it be reset at the beginning of each epoch?

Good question! By default, the LSTM layer we use has stateful set to False, which is the typical use case (unless we intentionally want to carry over states from the previous batch as a way to process longer sequences). This means that the hidden states are reset after each batch of predictions (or training). There is no explicit state reset in the code, but if you would like to experiment, you could do so using: agent.model.model.layers[1].reset_states()
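As a quick illustration of what that reset does, the sketch below (again a hypothetical stand-in for the DRQN network, with illustrative sizes and stateful=True so the states persist between calls) runs a forward pass and then zeroes the states with reset_states(), as you might between episodes:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the DRQN model: Input -> LSTM -> Dense.
batch_size, seq_len, state_dim, units = 1, 4, 8, 16
inputs = tf.keras.Input(batch_shape=(batch_size, seq_len, state_dim))
x = tf.keras.layers.LSTM(units, stateful=True)(inputs)
model = tf.keras.Model(inputs, tf.keras.layers.Dense(2)(x))

# After a forward pass, the stateful LSTM holds non-zero h and c.
model(np.random.rand(batch_size, seq_len, state_dim).astype(np.float32))
h, c = model.layers[1].states
print(np.abs(h.numpy()).sum() > 0)  # states carry values

# reset_states() zeroes both h and c in place.
model.layers[1].reset_states()
print(np.abs(h.numpy()).sum() == 0.0)  # states are back to zero
```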

I hope that gives you the clarification you were looking for. Thank you for your interest in the book and this code repository.