Open HappyTiger1 opened 3 years ago
Notice the difference between w2change and the actual formula in the paper. Convergence is proved in the paper via Lyapunov analysis.
Thank you for your answer! In the paper, the linear model has deterministic critic weight values [1.4279 1.1612 +0.1366 1.4462 +0.1480 0.4317], but I can't reproduce them with the algorithm. How can this be explained?
I found the answer: persistent excitation has to be added.
@HappyTiger1 do you know how to add persistent excitation to this code, and could you share your code with me?
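For anyone landing here with the same question, one common way to get persistent excitation is to add a small probing noise to the control input while the weights are being tuned, then switch it off. The snippet below is only a rough sketch of that idea in Python, not the code from this repo or the exact signal from the paper. The names `probing_noise` and `excited_control`, the frequencies, the decay rate, and the `t_stop` cutoff are all hypothetical and would need tuning for the actual system.

```python
import math


def probing_noise(t, amplitude=1.0, decay=0.01, freqs=(1.0, 3.0, 7.0, 11.0, 13.0)):
    """Hypothetical probing signal: an exponentially decaying average of sinusoids.

    The frequencies, amplitude and decay rate are assumptions, not values from the paper.
    """
    envelope = amplitude * math.exp(-decay * t)
    return envelope * sum(math.sin(w * t) for w in freqs) / len(freqs)


def excited_control(u_actor, t, t_stop=500.0):
    """Add probing noise to the actor's control signal until t_stop, then return it unchanged.

    t_stop is an assumed cutoff after which the weights are expected to have settled.
    """
    if t < t_stop:
        return u_actor + probing_noise(t)
    return u_actor
```

In a simulation loop, the control computed from the actor weights would be passed through `excited_control(u, t)` before it is applied to the plant. If the critic weights still drift after the noise is removed, the excitation was probably too weak or was switched off too early.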
I updated the code, but this algorithm does not yield the promised results.
Have you seen this article? I have a problem. The weights of the actor NN are W2, but how should the value of F1 in Theorem 2 be chosen? I think this paper is wrong.
When the learning rate or the initial critic weights change, why do the critic weights converge to different values? How can their optimality be justified?