In the MOMPO model, I see a lot of 'task', 'objective', and 'reward', but I'm not sure where the objectives are input to this model.
Sorry, it's quite a lot of code, and I'm not sure where a task like 'stand up straight' or 'run' for a MuJoCo humanoid would go.
I know exactly which objectives I want to use (maximizing the head's distance from the ground, minimizing force, and maximizing velocity), but I don't see where quantities like these are set as objectives. I could probably figure this out on my own, but a pointer in the right direction would help.
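To make the question concrete, this is roughly what I mean by those three objectives, written as plain per-step reward functions. It is only a minimal sketch: the observation keys (head_height, velocity) and the use of squared action magnitude as a stand-in for force are my own assumptions, and none of these names come from Acme or MOMPO.

```python
import numpy as np

# Hypothetical reward functions, one per objective.
# The observation keys below are assumed for illustration only.

def head_height_reward(observation, action):
    # Higher head position gives a higher reward.
    return float(observation["head_height"])

def low_force_reward(observation, action):
    # Stand-in for "minimize force": penalize large control signals.
    return -float(np.sum(np.square(action)))

def velocity_reward(observation, action):
    # Forward speed, assumed to be the first velocity component.
    return float(observation["velocity"][0])

OBJECTIVES = {
    "head_height": head_height_reward,
    "low_force": low_force_reward,
    "velocity": velocity_reward,
}

def evaluate_objectives(observation, action):
    # Returns one scalar reward per named objective.
    return {name: fn(observation, action) for name, fn in OBJECTIVES.items()}
```

What I can't find is the place in the MOMPO code where a set of functions like this would be handed to the agent.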
A note in acme/agents/tf/mompo/learning.py, around line 86, says:
"the action norm objective in the Humanoid run task is defined by setting the qvalue_fn to be the l2-norm of the actions."
You lost me here. Is there an example of this code somewhere?
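My best guess at what the note means, as a minimal NumPy sketch. The function name and the (batch, action_dim) input shape are my assumptions, not Acme's actual signature, and I don't know whether the norm is supposed to be negated so that smaller actions score higher.

```python
import numpy as np

def action_norm_qvalue(actions):
    # Hypothetical helper: treats the l2-norm of each action as that
    # action's "Q-value" for the action norm objective.
    # Assumes `actions` has shape (batch, action_dim).
    actions = np.asarray(actions, dtype=np.float64)
    return np.linalg.norm(actions, axis=-1)

# Example: one action with norm 5, one all-zero action.
example = action_norm_qvalue([[3.0, 4.0], [0.0, 0.0]])
```

If that reading is right, I still don't see how a function like this gets passed to the MOMPO learner alongside the other objectives.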