zt95 / infinite-horizon-off-policy-estimation

Discrete G #4

Open clvoloshin opened 5 years ago

clvoloshin commented 5 years ago

I'm trying to get the discrete (discounted) case to work for a toy MDP, but the results don't make sense. I may be doing something wrong.

Could you explain what G, Nstate, Ghat are and how they relate to \Delta and k(s,s') in the paper?