google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0

[question] Printing model summary #176

Closed rfali closed 2 years ago

rfali commented 3 years ago

I was going through this Colab and trying to make a custom agent on the pattern of MyRandomDQNAgent(dqn_agent.DQNAgent). The DQNAgent's network is specified here, which in turn is the NatureDQNNetwork specified here. Now this is going to sound stupid, but I have a similar object (a multi-headed DQN built in Dopamine) that I am trying to recreate in another library (RLlib). I wanted to print the model.summary() of this Keras.Model to make sure both are equivalent, but I am really struggling with it. I'd appreciate it if someone could point out a solution. Thanks!
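In case it is useful context: in TF2, the usual reason model.summary() fails on a subclassed network like NatureDQNNetwork is that subclassed models have no defined shapes until they are called on input. A minimal sketch with a hypothetical stand-in network (not Dopamine's actual class, and ignoring Dopamine's TF1-style session setup):

```python
import numpy as np
import tensorflow as tf

# Minimal stand-in for a Dopamine-style subclassed network -- NOT the real
# NatureDQNNetwork (which uses three conv layers per the Nature DQN paper).
# The point: a subclassed tf.keras.Model has no defined shapes until it has
# been called on input, so model.summary() fails before a forward pass.
class TinyDQNNetwork(tf.keras.Model):
    def __init__(self, num_actions):
        super().__init__(name='tiny_dqn')
        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(64, activation='relu')
        self.q_values = tf.keras.layers.Dense(num_actions)

    def call(self, state):
        x = self.flatten(state)
        x = self.dense1(x)
        return self.q_values(x)

model = TinyDQNNetwork(num_actions=4)
# Run a dummy Atari-shaped batch through the model to build it...
_ = model(np.zeros((1, 84, 84, 4), dtype=np.float32))
# ...after which summary() can report layers and parameter counts.
model.summary()
```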

psc-g commented 3 years ago

hi farrukh, have you tried inspecting your model with tensorboard? in the "Graph" tab it should provide the setup of your model.


rfali commented 3 years ago

Small post but the pictures make it look bigger. Question at the end.

I started with this Tensorboard Tutorial and, as I wanted to see a Keras model summary, I was interested in seeing the Conceptual Graph. I ran their Colab too to make sure. It works well, and I could see the Conceptual Graph under the Keras tag. This is how it looks: (screenshot)
and here is a screenshot from the Tensorboard Tutorial highlighting the Keras tag: (screenshot)

I ran the Dopamine Cartpole Colab and added the tensorboard callback as well: tensorboard_callback = keras.callbacks.TensorBoard(log_dir=DQN_PATH). I was able to see the Graph but not the Conceptual Graph, as there was no Keras tag :( (screenshot)
I was able to run the agents Colab and visualize the Graph, but not the Conceptual Graph (same as above). Here is that screenshot: (screenshot)
Finally, I made a Colab myself (sharing it here) (sourced from here) to experiment with a Pong agent (to check atari_lib.NatureDQNNetwork). Unfortunately, this happened: (screenshot)

I checked the callback API to see if there is an option to enable tags, but didn't find any.

Is there a way I can run the Pong colab or the custom agent colab and see the Keras tag in Tensorboard?

I know this is not itself an issue with Dopamine, but I would greatly appreciate any help with the question above. Thanks.

psc-g commented 3 years ago

make sure you're setting debug_mode to true when creating the agents: https://github.com/google/dopamine/blob/master/dopamine/discrete_domains/run_experiment.py#L61

you can do this either via gin bindings: --gin_bindings="create_agent.debug_mode=True"

or just change the code directly. otherwise, the agent will not be writing event files for tensorboard.
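for reference, the full command-line pattern looks something like this (base_dir and the gin file path are illustrative, adjust them to your checkout; the key part is the debug_mode binding):

```shell
# run a Dopamine experiment with debug_mode enabled via a gin binding;
# base_dir and the gin file path here are examples, not required values.
python -um dopamine.discrete_domains.train \
  --base_dir=/tmp/dopamine_dqn \
  --gin_files='dopamine/agents/dqn/configs/dqn.gin' \
  --gin_bindings="create_agent.debug_mode=True"
```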


rfali commented 3 years ago

I added debug_mode=True in the Dopamine agents colab, as follows: def create_random_dqn_agent(sess, environment, summary_writer, debug_mode=True):

and decreased the number of iterations to 10, but the Tensorboard output still looked like this: (screenshot)
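Before debugging the Tensorboard UI itself, a quick sanity check is whether the agent wrote any event files at all. A small sketch (DQN_PATH stands in for whatever log directory your colab uses; the demo directory and fake file below are just for illustration):

```python
import glob
import os
import tempfile

def find_event_files(log_dir):
    """Return any TensorBoard event files found under log_dir, recursively."""
    pattern = os.path.join(log_dir, '**', 'events.out.tfevents.*')
    return glob.glob(pattern, recursive=True)

# Demo with a throwaway directory containing one fake event file; in
# practice, point find_event_files at your real log dir (e.g. DQN_PATH).
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, 'events.out.tfevents.12345.demo'), 'w').close()
print(find_event_files(demo_dir))
```

If this returns an empty list for your log directory, the problem is on the writing side (e.g. debug_mode), not in Tensorboard.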

psc-g commented 3 years ago

it still says debug_mode=False?


rfali commented 3 years ago

sorry, that was a typo at that moment, but I had run it with debug=True. Here is the Colab I am working on with just that single modification to the agents colab. Still no tensorboard event files. Perhaps you can point out what I am doing wrong in this colab or the agents colab?

psc-g commented 3 years ago

sorry, that link did not work. if it works with the agents colab but not with your modified colab, then the issue must lie in the modification somewhere. i'd suggest retracing your steps?

On Fri, May 7, 2021 at 2:40 PM Farrukh Ali @.***> wrote:

sorry, that was a typo at that moment, but I had run it with debug=True. Here is the Colab https://colab.research.google.com/drive/1T18giS-U1VF09tVL3iCB8CHSrFxJuiJA?usp=sharing http://url I am working on with just that single modification to the agents colab.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/google/dopamine/issues/176#issuecomment-834684073, or unsubscribe https://github.com/notifications/unsubscribe-auth/AE3CCMPHXS4W2UZ7MAIXRGLTMQX3VANCNFSM44HCXLDQ .

rfali commented 3 years ago

@psc-g sorry, I thought I had made the Colab public, but it wasn't. You can try it now here for a glance.

To cut to the chase, I am trying to modify this Dopamine-based DQN agent for use in another library (RLlib), since I want to run it in a multi-agent environment. Please see this issue for more details and a figure of the network architecture.

Given your familiarity with Dopamine, could you please answer the following:

  1. The network here is outputting multiple Q-heads, one for each gamma. Is that correct? I was actually trying to print this model's summary to confirm the model's outputs and shapes when I posted my question.
  2. Since a vanilla DQN outputs a single tensor of Q-values (one per action), is the model above outputting multiple tensors or a list of tensors? I think for n gammas it outputs n tensors of Q-values, and the hyp_q_value is also a tensor, calculated through the integral function here. So the output of the model still only has Q-heads equal to num_gammas.
  3. The action is selected according to the acting_policy (last q-head or integral of all q-heads), but a separate loss is calculated for each head (using current_Q and target_Q); the individual losses are aggregated and scaled, and then minimized through gradient descent. Is my thinking correct?
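To make point 3 concrete, here is a rough numpy sketch of the per-head loss aggregation described above. This is an assumption-laden illustration, not Dopamine's actual code: it uses plain squared error rather than Dopamine's Huber loss, and the loss_scale parameter is hypothetical.

```python
import numpy as np

def multi_head_dqn_loss(current_q, target_q, loss_scale=1.0):
    """Aggregate per-head TD losses for a multi-gamma DQN (illustrative).

    current_q, target_q: arrays of shape (batch, num_gammas) holding the
    chosen-action Q-value per head and its bootstrapped target. Each head
    gets its own TD error; per-head losses are summed, then scaled. This
    mirrors the scheme described in point 3, not Dopamine's exact code.
    """
    # Squared TD error averaged over the batch, one loss per gamma head.
    per_head_loss = np.mean((current_q - target_q) ** 2, axis=0)
    # Aggregate across heads and apply the (assumed) scaling factor.
    return loss_scale * np.sum(per_head_loss)

# Example: batch of 2 transitions, 3 gamma heads, perfect targets.
current = np.array([[1.0, 2.0, 3.0], [0.5, 1.5, 2.5]])
target = np.array([[1.0, 2.0, 3.0], [0.5, 1.5, 2.5]])
print(multi_head_dqn_loss(current, target))  # identical -> 0.0
```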

Thank you for your time, I would appreciate your help.