jfpettit / flare

Modular Reinforcement Learning in PyTorch.

MIT License

API and algorithm structure unification #2

Open jfpettit opened 4 years ago

jfpettit commented 4 years ago

The algorithms in qpolgrad are now organized so that each one defines dedicated functions for loss calculation, and the algorithm's update function calls those functions. A2C and PPO need to be brought up to the same structure.

Specifically:

- Define compute_policy_loss and compute_value_loss functions in A2C and PPO.
- Modify the update rules for both algorithms to call the loss computation functions.
- Update docstrings to reflect your changes! If there aren't docstrings (sorry), add them!
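
For anyone picking this up, here is a rough sketch of the target structure. It is not flare's actual code: TinyActorCritic, the batch dictionary keys (obs, act, adv, ret, logp_old), the clip_ratio argument, and the use of two separate optimizers are all assumptions made for illustration. Only the function names compute_policy_loss and compute_value_loss come from this issue. The sketch assumes a discrete action space and folds A2C and PPO into one function just to keep it short; in the repo each algorithm would presumably get its own pair of functions.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class TinyActorCritic(nn.Module):
    """Hypothetical stand-in for an actor-critic network (not flare's class)."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.policy = nn.Linear(obs_dim, n_actions)  # produces action logits
        self.value = nn.Linear(obs_dim, 1)           # produces a state-value estimate


def compute_policy_loss(ac, batch, clip_ratio=None):
    """Return the A2C policy-gradient loss, or the PPO clipped surrogate if clip_ratio is set."""
    dist = Categorical(logits=ac.policy(batch["obs"]))
    logp = dist.log_prob(batch["act"])
    if clip_ratio is None:
        # A2C: log-probability of the taken action weighted by its advantage.
        return -(logp * batch["adv"]).mean()
    # PPO: probability ratio against the data-collecting policy, clipped.
    ratio = torch.exp(logp - batch["logp_old"])
    clipped = torch.clamp(ratio, 1 - clip_ratio, 1 + clip_ratio)
    return -torch.min(ratio * batch["adv"], clipped * batch["adv"]).mean()


def compute_value_loss(ac, batch):
    """Return the mean squared error between predicted values and empirical returns."""
    values = ac.value(batch["obs"]).squeeze(-1)
    return ((values - batch["ret"]) ** 2).mean()


def update(ac, policy_optimizer, value_optimizer, batch, clip_ratio=None):
    """Take one gradient step per network; all loss math lives in the functions above."""
    policy_optimizer.zero_grad()
    policy_loss = compute_policy_loss(ac, batch, clip_ratio)
    policy_loss.backward()
    policy_optimizer.step()

    value_optimizer.zero_grad()
    value_loss = compute_value_loss(ac, batch)
    value_loss.backward()
    value_optimizer.step()
    return policy_loss.item(), value_loss.item()
```

With this layout the update function only handles optimizer bookkeeping, so the loss functions can be documented and unit-tested on their own.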

jfpettit commented 4 years ago

Working on this. I'm converting the code to PyTorch Lightning, both for a consistent structure across algorithms and for the automation Lightning offers.