opendilab / PPOxFamily

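PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, straighten out the code logic, and master practical decision AI applications)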
https://opendilab.github.io/PPOxFamily/
Apache License 2.0
1.89k stars, 169 forks

The flaw in chapter1_supp_trpo #2

Closed by MarginalCentrality 1 year ago

MarginalCentrality commented 1 year ago

Given two functions f_1 and f_2 with f_1(x_0) = f_2(x_0) and f'_1(x_0) = f'_2(x_0), there may not exist a ball B around x_0 such that: \forall x_1, x_2 \in B, f_1(x_2) >= f_1(x_1) \rightarrow f_2(x_2) >= f_1(x_1). For instance, take f_1(x) = x^2 and f_2(x) = -x^2 at x_0 = 0.

The claim may need additional assumptions in order to hold.
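
The counterexample can be checked numerically. The following is a minimal sketch, not part of the course material: the helper name `violates` and the chosen radii are hypothetical, and the conclusion is tested both against f_1(x_1), as written above, and against f_2(x_1), which is assumed here as an alternative monotonic-improvement reading. For every radius r it picks x_1 = r/2 and x_2 = r, both inside the ball of radius r around x_0 = 0.

```python
def f1(x):
    return x ** 2


def f2(x):
    return -(x ** 2)


def violates(x1, x2):
    """Return True if f1 does not decrease from x1 to x2 while the
    conclusion fails under both readings of the claim."""
    premise = f1(x2) >= f1(x1)
    as_written = f2(x2) >= f1(x1)        # conclusion as stated in the issue
    monotone_reading = f2(x2) >= f2(x1)  # assumed alternative reading
    return premise and not as_written and not monotone_reading


# Shrinking radii around x_0 = 0 (illustrative values only).
radii = [1.0, 1e-3, 1e-6]
results = {r: violates(r / 2, r) for r in radii}

for r, bad in results.items():
    print(f"radius={r:g}: x1={r / 2:g}, x2={r:g}, violation={bad}")
```

Because a violating pair exists for each radius tried, shrinking the ball does not rescue the claim at a stationary point in this example.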

lichuming commented 1 year ago

Thanks for your correction! The stationary point is indeed a special case that we overlooked. We will fix this issue in a later version.