interpreting-rl-behavior / interpreting-rl-behavior.github.io

Code for the site https://interpreting-rl-behavior.github.io/
Creative Commons Attribution 4.0 International

Generate data that demonstrates hx neuron target functions #14

Closed: leesharkey closed this issue 2 years ago

leesharkey commented 2 years ago

Corresponding text in draft:

Since the generative model is fully differentiable, we can optimise the VAE latent space vector that the generator uses to generate the samples so that the samples have certain interesting properties, such as large drops in the agent's value (unexpected failures) or specific sequences of actions (like consistently moving backwards).

We can even optimize the activity of hidden state neurons such that they are maximally or minimally activated, a method commonly used to interpret vision networks. But when we do this, we find that most neurons in the agent's hidden state are difficult to interpret: it's not clear what they encode. This hints at important differences between the interpretation of RL agents and convolutional networks.
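To make the idea concrete, here is a minimal sketch of a hidden-state neuron target function. It is not the code in target_func_exps.py. It assumes a toy linear "generator" `G` and a single tanh layer `W` as stand-ins for the VAE decoder and the agent's hidden state, and it uses a hand-written gradient where the real setup would presumably use autodiff. All names (`hidden_state`, `neuron_grad`, `optimise_latent`) are hypothetical.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def hidden_state(G, W, z):
    """Toy pipeline (assumption): observation x = G z, hidden state h = tanh(W x)."""
    return [math.tanh(a) for a in matvec(W, matvec(G, z))]


def neuron_grad(G, W, z, neuron):
    """Gradient of h[neuron] with respect to the latent vector z (chain rule)."""
    x = matvec(G, z)
    pre = sum(w * xi for w, xi in zip(W[neuron], x))
    dh_dpre = 1.0 - math.tanh(pre) ** 2
    return [
        dh_dpre * sum(W[neuron][i] * G[i][j] for i in range(len(G)))
        for j in range(len(z))
    ]


def optimise_latent(G, W, z, neuron, sign=1.0, lr=0.1, steps=200):
    """Gradient ascent (sign=+1) or descent (sign=-1) on one neuron's activation."""
    z = list(z)
    for _ in range(steps):
        g = neuron_grad(G, W, z, neuron)
        z = [zj + sign * lr * gj for zj, gj in zip(z, g)]
    return z


# Random toy weights; the dimensions are arbitrary illustrative choices.
rng = random.Random(0)
latent_dim, obs_dim, hidden_dim = 4, 6, 3
G = [[rng.gauss(0, 1) for _ in range(latent_dim)] for _ in range(obs_dim)]
W = [[rng.gauss(0, 1) for _ in range(obs_dim)] for _ in range(hidden_dim)]
z0 = [rng.gauss(0, 1) for _ in range(latent_dim)]

z_max = optimise_latent(G, W, z0, neuron=0, sign=+1.0)
z_min = optimise_latent(G, W, z0, neuron=0, sign=-1.0)

h0 = hidden_state(G, W, z0)[0]
h_max = hidden_state(G, W, z_max)[0]
h_min = hidden_state(G, W, z_min)[0]
print(f"start={h0:.3f} maximised={h_max:.3f} minimised={h_min:.3f}")
```

Other target functions (a value drop, a particular action sequence) would follow the same pattern, with only the scalar being differentiated changed.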

Figure that depicts samples that have been optimized for different target functions. This can be a panel so that a user can just select the target function name and see the associated videos. Should include hx neuron target function samples discussed below. The hx neuron target functions should be the default display.

danbraunai commented 2 years ago

@leesharkey I'm confused here. Is the task to generate the data, or to present it in the panel, or both? It looks like you've already written the code for the experiment in target_func_exps.py (the hx_neuron_target_function function).

leesharkey commented 2 years ago

Yeah, it was mainly just to generate it. Putting it in the panels wouldn't hurt either, but it's not necessary atm. The idea was to divide labour a bit since I'll be focusing on PC target functions. But until we get compute up and running again, both will have to wait.

leesharkey commented 2 years ago

When I get the compute up and running, I'll run this. Probably easier that way since I'll be dealing with setting things running anyway. I'll assign this to myself.

leesharkey commented 2 years ago

We're not going to do target functions. Closing.