Closed ChanMook closed 2 months ago
My suspicion is that this could be fixed by setting the action_repeat field in your training config equal to your episode length. I believe the agent will then take a single action at the start of the episode (still sampled randomly from a distribution, but held constant for the rest of the episode).
Alternatively, for fixed-action policies you could consider non-RL methods (just MJX without Brax), such as grid search or Bayesian optimisation.
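To illustrate the grid-search idea, here is a minimal sketch in JAX. It does not use your model: the `step` function is a toy 1D point mass that I made up as a stand-in for the physics step (in your case that is roughly where an MJX step on your own model and data would go), and `DT`, `EPISODE_LENGTH`, `TARGET_POS` and the cost are all hypothetical. The point is only the pattern: hold one action constant for the whole rollout, then evaluate many candidate actions in parallel with `jax.vmap`.

```python
import jax
import jax.numpy as jnp

# All constants below are illustrative assumptions, not values from your setup.
DT = 0.01              # integration time step
EPISODE_LENGTH = 100   # number of steps in one episode
TARGET_POS = 1.0       # position the toy system should reach


def step(state, action):
    """Toy stand-in for a physics step: a unit point mass pushed by `action`."""
    pos, vel = state
    vel = vel + action * DT   # semi-implicit Euler: update velocity first
    pos = pos + vel * DT      # then position, using the new velocity
    return (pos, vel)


def episode_cost(action):
    """Roll out one episode with the SAME action at every step."""
    def body(state, _):
        return step(state, action), None

    init = (jnp.zeros(()), jnp.zeros(()))
    (pos, _vel), _ = jax.lax.scan(body, init, None, length=EPISODE_LENGTH)
    return (pos - TARGET_POS) ** 2   # squared distance to target at episode end


# Grid search: evaluate every candidate fixed action in parallel.
candidates = jnp.linspace(-5.0, 5.0, 1001)
costs = jax.jit(jax.vmap(episode_cost))(candidates)
best_action = candidates[jnp.argmin(costs)]
print("best fixed action:", float(best_action))
```

For a multi-dimensional action the grid grows quickly, so random search or Bayesian optimisation over the same `episode_cost` function is likely a better fit. If your cost is differentiable, `jax.grad(episode_cost)` may also be worth trying.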
Hi, I'm working on MuJoCo MJX.
I use the MJX action value as the parameters that generate torque. The action value (control parameter) changes at every step, which causes problems in control. I want to find the optimal fixed action for a single episode. How can I do this?