awjuliani / DeepRL-Agents

A set of Deep Reinforcement Learning Agents implemented in Tensorflow.
MIT License
2.24k stars 826 forks source link

A3C Doom: Visualize Gradients (tricky) #36

Open IbrahimSobh opened 7 years ago

IbrahimSobh commented 7 years ago

Dears @awjuliani @DMTSource

Inspired by this paper

I wanted to know where the agent is looking while taking decisions. In other words, what are the pixels of input frame that have large influence on the decision it takes.

I have added the following line in AC_Network class:

self.input_gradients = tf.gradients(self.responsible_outputs,self.imageIn)[0]

so, given a batch of input frames, for each frame we have a map that tells us what are the important pixels.

It looks like this:

doombasicgradexample

Looks nice! After a good training, the agent at this particular moment, and for a given input frame, the agent looks at the center of the input frame where the demon probably is located.

My Question is:

We are using LSTM and batch size = 30 (or less if episode ends). For example: If we have batch of 30 frames, then we will have 30 corresponding maps. I want to know if each map depends on:

  1. Only the current frame (ex: map number 11 depends on input frame number 11 in this batch)
  2. OR, The previous frames (ex: map number 11 depends on input frames number 1 to 11 in this batch)
  3. Or, All frames in the batch (ex: map number X depends on all input frames from 1 to 30 in this batch)

Reason for asking: In some cases, and after training, I turn off some frames to test how the agent will act in case of noisy input. So, for example, in a batch of 30 frames, around 7 random frames are just black images. The agent performed well. However, The gradient for these black frames still look like the above figure.

doombasicgradexample

These pixels were important to agent, however the whole input frame was black, why?!

Hint: I think the correct answer is number (2). The decision is taken given all previous frames. If the current frame is black, the gradient of the decision given the current black frame depends on the current frame and all previous frames too.

What do you think?!

Regards