issues
search
junxnone
/
aiwiki
AI Wiki
https://junxnone.github.io/aiwiki
18
stars
2
forks
source link
RL AC
#459
Open
junxnone
opened
12 months ago
junxnone
commented
12 months ago
Actor Critic
Actor
: - 演员 - 推荐动作,根据每个动作的概率值
将环境的状态作为输入,并为其动作空间中的每个动作返回一个概率值
Critic
: - 评论家 - 未来预期回报,预期在未来获得的所有汇报的总和
将的环境状态作为输入,并返回对未来总回报的估计
Reference
Actor Critic Method - Paddlepaddle
Actor Critic
Reference