In actor-critic, can the critic's value prediction be designed to predict the distribution of action values? - Githubissues

MorvanZhou / Reinforcement-learning-with-tensorflow

Simple Reinforcement learning tutorials, 莫烦Python Chinese-language AI tutorials

https://mofanpy.com/tutorials/machine-learning/reinforcement-learning/

MIT License

8.91k stars 5.01k forks

In actor-critic, can the critic's value prediction be designed to predict the distribution of action values? #180

Open Hins opened 4 years ago

Hins commented 4 years ago

Then take the value of the corresponding action to compute v and v'.
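
To make the question concrete, here is a minimal sketch of what such a critic could look like. It is not the repository's code: the class name `LinearQCritic`, the linear model, and the SARSA-style target are all assumptions chosen for illustration. The critic outputs one value per action, and v and v' are read off at the indices of the actions actually taken.

```python
import numpy as np


class LinearQCritic:
    """Hypothetical critic that outputs one value per action, i.e. Q(s, .)."""

    def __init__(self, n_features, n_actions, lr=0.01, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        # One weight column per action (assumed linear model, for illustration only).
        self.W = rng.normal(scale=0.1, size=(n_features, n_actions))
        self.lr = lr
        self.gamma = gamma

    def action_values(self, s):
        # Returns a vector with one predicted value for each action.
        return s @ self.W

    def learn(self, s, a, r, s_next, a_next, done=False):
        # v: value of the action actually taken in s.
        v = self.action_values(s)[a]
        # v': value of the action taken in s_next (zero if the episode ended).
        v_next = 0.0 if done else self.action_values(s_next)[a_next]
        td_error = r + self.gamma * v_next - v
        # Semi-gradient update: only the column of the taken action changes.
        self.W[:, a] += self.lr * td_error * s
        return td_error


critic = LinearQCritic(n_features=2, n_actions=3, lr=0.1, gamma=0.5)
s, s_next = np.array([1.0, 0.0]), np.array([0.0, 1.0])
td = critic.learn(s, a=1, r=1.0, s_next=s_next, a_next=2)
print(critic.action_values(s), td)
```

The returned TD error could then be passed to the actor in place of the one computed from a state-value critic. Whether this trains as well as the state-value version is not something this sketch demonstrates.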