[FR] Contrib module for agent-based models and model-based RL

karalets commented 4 years ago

This issue proposes a pyro.contrib.agents module for building agent-based models and using control to guide their actions to maximize rewards.

This modeling task is commonly described as model-based reinforcement learning and has the structure of a Markov decision process (MDP). Some previous publications (in particular NIPS2018WS and ICML2019WS), which incorporate latent structure into agent and reward models, will guide this contrib. The aim is to iterate on versions of these models tutorial-style.

The underlying aim is to show how the model-control-exploration loop works in Pyro and to use the latent variable models to demonstrate how transferable structure can be learned and used for agent models.
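
To make the "model" and "exploration" parts of that loop concrete, here is a minimal sketch in plain Python. It is not the proposed Pyro implementation: the scalar linear simulator, its coefficients, and the helper names (`simulator_step`, `collect_transitions`, `fit_linear_dynamics`) are all assumptions chosen for illustration, and a least-squares fit stands in for the neural-network or latent-variable dynamics model this issue has in mind.

```python
import random


def simulator_step(x, u, rng):
    """Hypothetical 'true' environment (an assumption): noisy scalar linear system."""
    return 0.9 * x + 0.5 * u + rng.gauss(0.0, 0.01)


def collect_transitions(num_steps, rng):
    """Exploration: drive the simulator with random actions, record (x, u, x_next)."""
    x, data = 0.0, []
    for _ in range(num_steps):
        u = rng.uniform(-1.0, 1.0)
        x_next = simulator_step(x, u, rng)
        data.append((x, u, x_next))
        x = x_next
    return data


def fit_linear_dynamics(data):
    """Model learning: least-squares fit of x_next ~ a * x + b * u.

    Solves the 2x2 normal equations in closed form.
    """
    sxx = sum(x * x for x, _, _ in data)
    sxu = sum(x * u for x, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(x * y for x, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


if __name__ == "__main__":
    rng = random.Random(0)
    a_hat, b_hat = fit_linear_dynamics(collect_transitions(500, rng))
    print(f"estimated a={a_hat:.3f}, b={b_hat:.3f}")
```

In the full loop, the fitted model would then be handed to a controller, and the data gathered under that controller would be used to refit the model.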

Tasks

[ ] add an example for a simulated environment, e.g. CartPole
[ ] add an example for a dynamics model that will mimic the simulator, e.g. a neural network
[ ] add a control distribution as an inference technique to guide the dynamics model to high rewards
[ ] demonstrate control on known dynamics without learning
[ ] demonstrate the model-predictive control loop using the control distribution with learning of the dynamics in the loop
[ ] add a latent variable version of the dynamics model to capture hidden properties of the unknown agents following the referenced papers
[ ] utilize the latent variable and inference on it for control
[ ] write tutorial on this type of model
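
As a rough illustration of the "control on known dynamics without learning" and model-predictive control items above, the sketch below plans by sampling action sequences, weighting each by its exponentiated return, and applying only the first action before replanning. Everything in it is an assumption made for the example: the 1-D point-mass dynamics, the quadratic reward, the Gaussian action prior, and the function names. It uses only the standard library rather than Pyro, so it should be read as one simple way to treat control as inference, not as the control distribution this issue proposes.

```python
import math
import random


def step(state, action, dt=0.1):
    """Toy known dynamics (an assumption): 1-D point mass, state = (position, velocity)."""
    pos, vel = state
    vel = vel + dt * action
    pos = pos + dt * vel
    return (pos, vel)


def reward(state, action):
    """Hypothetical reward: stay near the origin with low velocity and low effort."""
    pos, vel = state
    return -(pos ** 2 + 0.1 * vel ** 2 + 0.01 * action ** 2)


def plan_first_action(state, rng, horizon=15, num_samples=200, temperature=1.0):
    """Sample action sequences from a Gaussian prior, weight each sequence by
    exp(return / temperature), and return the weighted mean of the first actions."""
    first_actions, returns = [], []
    for _ in range(num_samples):
        s, total = state, 0.0
        actions = [rng.gauss(0.0, 2.0) for _ in range(horizon)]
        for a in actions:
            s = step(s, a)
            total += reward(s, a)
        first_actions.append(actions[0])
        returns.append(total)
    best = max(returns)  # subtract the max before exponentiating for stability
    weights = [math.exp((r - best) / temperature) for r in returns]
    norm = sum(weights)
    return sum(w * a for w, a in zip(weights, first_actions)) / norm


def run_mpc(initial_state=(2.0, 0.0), num_steps=60, seed=0):
    """Model-predictive control loop: replan at every step, apply the first action only."""
    rng = random.Random(seed)
    state = initial_state
    trajectory = [state]
    for _ in range(num_steps):
        action = plan_first_action(state, rng)
        state = step(state, action)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    final_pos, final_vel = run_mpc()[-1]
    print(f"final position={final_pos:.3f}, velocity={final_vel:.3f}")
```

Replacing `step` with a learned model, and refitting that model on the transitions the controller collects, would give the learning-in-the-loop variant described in the task list.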

Rish001 commented 4 years ago

@karalets are you still working on this?

karalets commented 4 years ago

Yeah, I can revisit this pretty soon. I had a lot of this ready but got sidetracked. Thanks for the ping.

BartekSzpak commented 3 years ago

Hi @karalets, I would also be very excited to see your work implemented in Pyro.

pyro-ppl / pyro

[FR] Contrib module for agent-based models and model-based RL #1964
