JinraeKim commented 2 years ago

Implement Tao Bian's value iteration (VI)-based CT ADP [1, 2].
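As a rough illustration of the VI idea, here is a minimal scalar LQR sketch in Python. It is not the algorithm of [1, 2]: it assumes a fully known linear plant dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2, uses a constant step size, and simply steps the Riccati residual forward from p = 0, so no stabilising gain is supplied. All names (`value_iteration_scalar`, `step`, the plant values) are hypothetical.

```python
import math


def riccati_residual(p, a, b, q, r):
    """Residual of the scalar algebraic Riccati equation: 2*a*p + q - (b*p)**2 / r."""
    return 2.0 * a * p + q - (b * p) ** 2 / r


def value_iteration_scalar(a, b, q, r, step=0.01, tol=1e-12, max_iter=100_000):
    """Iterate p <- p + step * residual(p), starting from p = 0.

    The constant step size is an assumption made for this sketch.
    """
    p = 0.0
    for _ in range(max_iter):
        delta = step * riccati_residual(p, a, b, q, r)
        p += delta
        if abs(delta) < tol:
            break
    return p


def riccati_closed_form(a, b, q, r):
    """Positive root of the scalar Riccati equation, used only as a check."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Hypothetical example: open-loop unstable plant dx/dt = x + u with q = r = 1.
p_vi = value_iteration_scalar(a=1.0, b=1.0, q=1.0, r=1.0)
k_vi = 1.0 * p_vi / 1.0  # feedback gain k = b * p / r, so that u = -k * x
```

In this example the iterate can be compared against `riccati_closed_form(1.0, 1.0, 1.0, 1.0)`, and the resulting gain `k_vi` gives a stable closed loop `a - b * k_vi < 0` even though the iteration started from p = 0.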

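Given that linear IRL started by making Kleinman's algorithm online and partially model-free [3], I think ~~extending VI CT ADP [1, 2] to the online setting, or~~ ~~(while reading [1] I wondered why it was not formulated online, but [2] extends it naturally)~~ replacing the infimum part of the Hamiltonian would also be a good research direction.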
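For context, a minimal scalar sketch of Kleinman's iteration in Python, i.e. the model-based policy iteration that the comment above says [3] takes online. It assumes a known plant dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2, and it requires a stabilising initial gain `k0`. The function name and the example values are hypothetical, not taken from any of the references.

```python
def kleinman_scalar(a, b, q, r, k0, n_iter=20):
    """Scalar policy iteration for dx/dt = a*x + b*u under u = -k*x.

    Returns the final value p, the final gain k, and the history of p.
    """
    if a - b * k0 >= 0.0:
        raise ValueError("k0 must be stabilising: a - b*k0 < 0")
    k = k0
    history = []
    for _ in range(n_iter):
        a_cl = a - b * k  # closed-loop dynamics under the current gain
        # Policy evaluation: scalar Lyapunov equation 2*a_cl*p + q + r*k**2 = 0.
        p = -(q + r * k * k) / (2.0 * a_cl)
        # Policy improvement: k = b*p/r.
        k = b * p / r
        history.append(p)
    return p, k, history


# Hypothetical example: dx/dt = x + u, q = r = 1, initial gain k0 = 2 (a - b*k0 = -1).
p_pi, k_pi, p_history = kleinman_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
```

Unlike a VI-type update, this sketch refuses to start from a non-stabilising gain such as `k0 = 0.0` for this plant, because the Lyapunov equation in the evaluation step would no longer yield a valid cost.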

For reference, [3] is already implemented.

Refs

[1] T. Bian and Z.-P. Jiang, “Value Iteration, Adaptive Dynamic Programming, and Optimal Control of Nonlinear Systems,” in 2016 IEEE 55th Conference on Decision and Control (CDC), Las Vegas, NV, USA, Dec. 2016, pp. 3375–3380. doi: 10.1109/CDC.2016.7798777.

[2] T. Bian and Z.-P. Jiang, “Reinforcement Learning and Adaptive Optimal Control for Continuous-Time Nonlinear Systems: A Value Iteration Approach,” IEEE Trans. Neural Netw. Learning Syst., pp. 1–10, 2021, doi: 10.1109/TNNLS.2020.3045087.

[3] D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, “Adaptive Optimal Control for Continuous-Time Linear Systems Based on Policy Iteration,” Automatica, vol. 45, no. 2, pp. 477–484, Feb. 2009, doi: 10.1016/j.automatica.2008.08.017.

JinraeKim commented 2 years ago

I'll work on this in this project.

JinraeKim commented 2 years ago

fdcl-data-driven-control / data-driven-control

Implement VI-type CT ADP that does not require an initial stabilising gain #10

Refs

14 probably needs to be done first.