Caltech-AMBER / ambersim

In-house tools built on GPU-accelerated simulation
MIT License

Reduce PPO boilerplate #28

Closed (vincekurtz closed 11 months ago)

vincekurtz commented 12 months ago

Training a PPO agent requires a lot of boilerplate code (network factories, training functions, inference functions, etc.). We should write some utils to reduce this boilerplate and offer sane defaults.
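For illustration only, here is a rough sketch of what such a utility could look like. It is not ambersim code: `PPODefaults`, `make_trainer`, and `fake_train` are hypothetical names, the hyperparameter names and values are untuned placeholders, and the wrapped training function is assumed to accept them as keyword arguments.

```python
from dataclasses import asdict, dataclass
from functools import partial
from typing import Any, Callable


@dataclass(frozen=True)
class PPODefaults:
    """Hypothetical default hyperparameters (placeholder values, not tuned)."""

    num_timesteps: int = 1_000_000
    num_envs: int = 1024
    learning_rate: float = 3e-4
    discounting: float = 0.99
    entropy_cost: float = 1e-2
    seed: int = 0


def make_trainer(train_fn: Callable[..., Any], **overrides: Any) -> Callable[..., Any]:
    """Bind the defaults, updated with any overrides, to a training function.

    Assumes train_fn accepts these hyperparameters as keyword arguments.
    """
    config = {**asdict(PPODefaults()), **overrides}
    return partial(train_fn, **config)


def fake_train(environment: Any, **kwargs: Any) -> dict:
    """Stand-in for a real PPO training function; it only echoes its inputs."""
    return {"environment": environment, **kwargs}


# Override one default and leave the rest untouched.
trainer = make_trainer(fake_train, learning_rate=1e-3)
result = trainer(environment="cartpole")
```

Keeping the defaults in one frozen dataclass means callers only spell out what they change, and the wrapper stays independent of whichever PPO implementation ends up behind `train_fn`.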

pculbertson commented 12 months ago

It's probably worth asking whether we want to re-implement PPO or whether there are other implementations out there we can use. The Brax PPO is pretty kludgy, and I agree it's hard to use and understand.