emad-arezoomand / online-actor-critic-algorithm-to-solve-continues-time-infinite-horizon-optimal-control-problem

Online actor-critic algorithm to solve the continuous-time infinite-horizon optimal control problem

online-actor-critic-neural-net-optimal-controler #2

Open Yuqing0127 opened 3 years ago

Yuqing0127 commented 3 years ago

I have the same problem as the one described in the first post.

emad-arezoomand commented 3 years ago

I beg your pardon?

Yuqing0127 commented 3 years ago

The problem is that the optimal critic network weights are fixed, but the actual critic network weights do not converge to those values. Moreover, different initial guesses for the critic network weights lead to different final critic network weights. This means that we cannot obtain the optimal control policy with this code or this method.

emad-arezoomand commented 2 years ago

I agree with you.

boxaio commented 1 year ago

I guess algorithms and papers of this kind all have the same problem. Adding persistent noise is not enough for convergence to the right weight values, while the formal proof given in the paper neglects the practical techniques that ensure the right solution.
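
The behaviour described in this thread can be reproduced on a toy problem. The following is a minimal sketch, not the repository's code: it assumes a critic that is linear in its weights and a normalized gradient-descent update on the squared residual, and the names `train`, `pe_regressor`, `non_pe_regressor`, and `W_TRUE` are hypothetical. When the regressor does not excite every direction of the weight space, the residual still goes to zero, but the final weights depend on the initial guess.

```python
import numpy as np

# Hypothetical "optimal" weights that the critic should recover.
W_TRUE = np.array([1.0, 2.0, 3.0])

def pe_regressor(t):
    # Three linearly independent signals: excites every weight direction.
    return np.array([np.sin(t), np.cos(t), np.sin(2.0 * t)])

def non_pe_regressor(t):
    # First and third components are identical, so the direction
    # [1, 0, -1] is never excited and only w[0] + w[2] is identifiable.
    return np.array([np.sin(t), np.cos(t), np.sin(t)])

def train(w0, regressor, steps=20000, dt=0.01, alpha=10.0):
    """Euler-discretized normalized gradient descent on (phi^T w - y)^2."""
    w = np.array(w0, dtype=float)
    for k in range(steps):
        phi = regressor(k * dt)
        err = phi @ w - phi @ W_TRUE  # residual against the target output
        w -= dt * alpha * phi * err / (1.0 + phi @ phi) ** 2
    return w

if __name__ == "__main__":
    inits = ([0.0, 0.0, 0.0], [2.0, 0.0, -2.0])
    for name, reg in (("excited", pe_regressor), ("not excited", non_pe_regressor)):
        for w0 in inits:
            print(name, "init", w0, "-> final", np.round(train(w0, reg), 3))
```

In this sketch the sufficiently excited case should end near `W_TRUE` for both initial guesses, while the other case should end at two different weight vectors. This only illustrates the identifiability issue; it says nothing about whether the probing noise used in the actual code is sufficient.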