Define systems and control objectives with Brax `envs` rather than the custom `system`/`task` breakdown we were using before.
This adds some bloat and makes terminal costs more awkward to implement. But the bloat should be worth it because (1) it smooths the way to using MJX and (2) it lets us baseline more easily against the RL algorithms in Brax, as well as MBD.
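
To illustrate why terminal costs get more awkward, here is a minimal sketch in plain Python. It does not use Brax itself: `State`, `PointEnv`, `horizon`, and `terminal_weight` are hypothetical names that only mimic the `reset`/`step` pattern, where each step returns a single reward. Under that assumption, the env has to track the step count and fold the terminal cost into the last step's reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for an env state (not the Brax State class)."""
    x: float       # system state (1-D position)
    t: int         # steps taken so far
    reward: float  # reward produced by the most recent step
    done: bool     # True once the horizon is reached


class PointEnv:
    """Toy 1-D integrator that should be driven to the origin."""

    def __init__(self, horizon=3, terminal_weight=10.0):
        self.horizon = horizon
        self.terminal_weight = terminal_weight

    def reset(self, x0=1.0):
        return State(x=x0, t=0, reward=0.0, done=False)

    def step(self, state, action):
        x = state.x + action
        t = state.t + 1
        running_cost = x**2 + 0.1 * action**2
        # The env only emits per-step rewards, so the terminal cost
        # has to be added on the step that reaches the horizon.
        done = t >= self.horizon
        terminal_cost = self.terminal_weight * x**2 if done else 0.0
        return State(x=x, t=t, reward=-(running_cost + terminal_cost), done=done)


def rollout(env, actions, x0=1.0):
    """Apply a fixed action sequence and return (total reward, final state)."""
    state = env.reset(x0)
    total = 0.0
    for a in actions:
        state = env.step(state, a)
        total += state.reward
    return total, state
```

With a separate system/task split, the terminal cost could be a standalone function of the final state; here it is tied to the env's own notion of when an episode ends.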