Closed tkkim-robot closed 3 years ago
Thank you for investigating! I had some other use cases where having an additional cost on the action made sense (hence why running cost is q(x,u) instead of q(x)), but having no action cost in the examples might be better.
What?
I've changed the cost function in the pendulum tests. The cost on the action should be removed from the 'running_cost(state, action)' functions.
Why?
As described in Algorithm 1 of the paper, 'q(x)' represents the cost of the estimated state, and the term that follows it ('r u sigma * (u - t)') represents the cost of the control action. Since the control cost is already accounted for in 'perturbation_cost', it should be removed from 'running_cost()'. 'q(x)' depends only on the state.
How?
I simply deleted the control cost from each of the three test scripts.
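For illustration, the change can be sketched roughly as follows. This is a minimal, dependency-free sketch and not the actual test code: the real scripts presumably operate on batched tensors, and the weights (0.1, 0.001), the 'angle_normalize' helper, and the name 'running_cost_with_action' are assumptions made here for the example.

```python
import math


def angle_normalize(theta):
    # Hypothetical helper: wrap an angle into [-pi, pi).
    return ((theta + math.pi) % (2 * math.pi)) - math.pi


def running_cost_with_action(state, action):
    # Before (assumed form): state cost plus an explicit action penalty.
    # The 0.1 and 0.001 weights are illustrative assumptions.
    theta, theta_dt = state
    return angle_normalize(theta) ** 2 + 0.1 * theta_dt ** 2 + 0.001 * action ** 2


def running_cost(state, action):
    # After: q(x) depends only on the state. 'action' stays in the
    # signature so the function still matches 'running_cost(state, action)',
    # but it no longer contributes to the cost.
    theta, theta_dt = state
    return angle_normalize(theta) ** 2 + 0.1 * theta_dt ** 2
```

With this version, two calls that share a state but differ in action return the same cost, so any penalty on the control comes only from the controller's own perturbation term.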
Testing?
The pendulum examples were tested after fixing the cost functions, and the results showed that the pendulum gets upright faster than with the previous cost functions.
Thanks for your great work.