[MJX] How to keep action values consistent in single episode?

google-deepmind / mujoco

Multi-Joint dynamics with Contact. A general purpose physics simulator.

https://mujoco.org

Apache License 2.0


[MJX] How to keep action values consistent in single episode? #1916

Closed ChanMook closed 2 months ago

ChanMook commented 2 months ago

Hi, I'm working on MuJoCo MJX.

I use the MJX action values as the parameters that generate torque. The action value (control parameter) changes at every step, which causes problems in control. I want to find an optimal action that stays fixed for a single episode. How can I do this?

Balint-H commented 2 months ago

My suspicion is that this could be fixed by setting the action_repeat field in your training to be equal to your episode length. Then I believe the agent will take a single action at the start of the episode (still sampled randomly from a distribution, but kept constant for the rest of the episode).
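To illustrate the mechanism, here is a minimal pure-Python sketch of what action repeat does. It assumes a toy one-dimensional point mass (`toy_step`) and a hypothetical `rollout` helper; neither is Brax or MJX code, and the real training loop may differ in details. With `action_repeat` equal to the episode length, the policy is queried only once per episode.

```python
import random


def toy_step(state, action):
    """Hypothetical stand-in for one physics step: a point mass driven by a force."""
    pos, vel = state
    dt = 0.01
    vel = vel + action * dt
    pos = pos + vel * dt
    reward = -abs(pos - 1.0)  # closer to the target position 1.0 is better
    return (pos, vel), reward


def rollout(policy, episode_length, action_repeat):
    """Run one episode, querying the policy once every `action_repeat` steps."""
    state = (0.0, 0.0)
    total_reward = 0.0
    actions_taken = []
    action = None
    for t in range(episode_length):
        if t % action_repeat == 0:
            action = policy(state)  # a new action is sampled only at repeat boundaries
        actions_taken.append(action)
        state, reward = toy_step(state, action)
        total_reward += reward
    return total_reward, actions_taken


rng = random.Random(0)


def policy(state):
    """Random policy, standing in for an untrained stochastic agent."""
    return rng.uniform(-1.0, 1.0)


_, per_step = rollout(policy, episode_length=100, action_repeat=1)
_, held = rollout(policy, episode_length=100, action_repeat=100)
print(len(set(per_step)), len(set(held)))
```

The second rollout applies one sampled action for all 100 steps, which is the behaviour described above.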

Balint-H commented 2 months ago

Alternatively, for fixed-action policies you can consider non-RL methods (just MJX without Brax), such as grid search or Bayesian optimisation.
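A minimal sketch of the grid-search idea, assuming a toy point-mass episode (`episode_return`) in place of a real simulation; the dynamics, the cost, and the search range are all made up for illustration. In an MJX setup, the inner loop would instead be a rollout that calls `mjx.step` with the control held constant, and the candidates could be evaluated in a batch with `jax.vmap`.

```python
def episode_return(action, episode_length=200, dt=0.01):
    """Return of one episode with the same action applied at every step (toy dynamics)."""
    pos, vel = 0.0, 0.0
    total = 0.0
    for _ in range(episode_length):
        vel += action * dt
        pos += vel * dt
        total += -(pos - 1.0) ** 2  # penalize squared distance from the target at 1.0
    return total


def grid_search(low, high, num_points):
    """Evaluate evenly spaced candidate actions and return the best one with its return."""
    step = (high - low) / (num_points - 1)
    candidates = [low + i * step for i in range(num_points)]
    returns = [episode_return(a) for a in candidates]
    best = max(range(num_points), key=returns.__getitem__)
    return candidates[best], returns[best]


best_action, best_return = grid_search(-2.0, 2.0, 41)
print(best_action, best_return)
```

Each candidate is a single scalar that stays fixed for the whole episode, so no per-step policy is involved. A finer grid or a Bayesian optimiser could replace `grid_search` without changing `episode_return`.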
