CampusAI /

DeepRL content website:

404 link in Lecture 6 #4

fedetask closed 4 years ago

fedetask commented 4 years ago

In Lecture 6, section Reducing Policy Gradient Variance, in the sentence

OBS: If we don't want to fit something that takes both states and actions we can just fit $V^{\pi}$ at the cost of using a single-sample estimate for $s_{t+1}$. We will do this for now, to fit $Q^\pi$ look into Q-learning methods.

the link to Q-learning is missing. Did you mean to link to Lecture 7 or Lecture 8, or to some external reference?

OleguerCanal commented 4 years ago

Thanks for noticing, I'll link it to Lecture 7.