interpreting-rl-behavior / interpreting-rl-behavior.github.io

Code for the site https://interpreting-rl-behavior.github.io/
Creative Commons Attribution 4.0 International

Get all required target functions working #61

Closed: leesharkey closed this 2 years ago

leesharkey commented 2 years ago

Demoting the urgency of this because I think it'll likely take too long to fix.

The optimization is extremely unstable. It more or less works for most target functions, but not for the IC directions.

My main hypothesis as to why:

There are multiple discrete variables in the latents (the discrete categorical variables in the RSSM, and also the action space). This means that slight changes in the bottleneck vector can lead to very different samples.
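
To make the hypothesis concrete, here's a toy sketch (not the project's code) of how a discrete variable can turn a tiny change in a continuous vector into a large change in the sample. The weight matrix `W`, the 2-d "bottleneck" `z`, and the helper `one_hot_argmax` are all made up for illustration; the real model samples from RSSM categoricals and an action distribution rather than taking a plain argmax.

```python
import numpy as np

def one_hot_argmax(logits):
    """Return a one-hot vector marking the largest logit (a stand-in for a discrete sample)."""
    out = np.zeros_like(logits)
    out[np.argmax(logits)] = 1.0
    return out

# Hypothetical linear map from a 2-d bottleneck vector to logits over 3 categories.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])

z = np.array([0.50, 0.49])                # logits for categories 0 and 1 are nearly tied
z_perturbed = z + np.array([0.0, 0.02])   # small nudge to the bottleneck vector

sample_a = one_hot_argmax(W @ z)            # category 0 wins
sample_b = one_hot_argmax(W @ z_perturbed)  # category 1 wins

shift_in = np.linalg.norm(z_perturbed - z)    # size of the change in the bottleneck
shift_out = np.linalg.norm(sample_b - sample_a)  # size of the change in the discrete sample

print(f"input change:  {shift_in:.3f}")
print(f"output change: {shift_out:.3f}")
```

The discrete output jumps by a fixed amount no matter how small the nudge is, so the objective seen by the optimizer is piecewise constant with sudden jumps. That would be consistent with the instability described above, though it's only an illustration of the hypothesis, not a confirmation of it.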

Potential future solution:

leesharkey commented 2 years ago

We're not going to do target functions. Dataset examples will suffice.