Guiliang / Sport-Analytic-NN

Neural Network Realization of Project Sport Analytic

Reading code about TD(lambda) #8

Open Guiliang opened 7 years ago

Guiliang commented 7 years ago

Read the code of TD(lambda) here: https://github.com/Guiliang/Sport-Analytic-NN/blob/master/td_prediction_eligibility_trace.py, focusing on the gradient descent in the neural network structure.

Guiliang commented 7 years ago

See Gradient Descent Sarsa(λ) in http://classes.engr.oregonstate.edu/mime/fall2008/me539/Lectures/ME539-w6-RL2_notes.pdf and try to implement it with TensorFlow.
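
For orientation, here is a minimal sketch of a single gradient-descent Sarsa(λ) update. It is written in plain Python rather than TensorFlow and assumes a linear action-value function Q(s, a) = w · x(s, a) with accumulating traces, so the gradient of Q with respect to w is just the feature vector x. The function name `sarsa_lambda_update`, the helper `dot`, and the toy feature vectors are hypothetical and are not taken from the repository or the lecture notes.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def sarsa_lambda_update(w, z, x, reward, x_next, alpha, gamma, lam, terminal=False):
    """One Sarsa(lambda) step for a linear Q(s, a) = w . x(s, a).

    w      : weight vector
    z      : eligibility trace (same length as w)
    x      : features of the current (state, action) pair
    x_next : features of the next (state, action) pair chosen by the policy
    """
    q = dot(w, x)
    q_next = 0.0 if terminal else dot(w, x_next)
    delta = reward + gamma * q_next - q            # TD error
    # Accumulating trace: decay the old trace, then add the gradient of Q (= x).
    z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
    # Move every weight in proportion to its eligibility.
    w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
    return w, z, delta


# Two hand-checkable steps on made-up one-hot features.
w, z = [0.0, 0.0], [0.0, 0.0]
w, z, _ = sarsa_lambda_update(w, z, [1.0, 0.0], 1.0, [0.0, 1.0],
                              alpha=0.5, gamma=0.9, lam=0.8)
w, z, _ = sarsa_lambda_update(w, z, [0.0, 1.0], 0.0, [1.0, 0.0],
                              alpha=0.5, gamma=0.9, lam=0.8)
print(w, z)
```

In a neural-network version the feature vector x would be replaced by the gradient of the network output with respect to its weights, with one trace entry per weight.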

zeruniverse commented 7 years ago

Similar to a weighted average of the last few states.
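
One way to read this comment, sketched under the assumption of an accumulating trace with linear features: the recursively updated trace equals a sum of the recent feature vectors weighted by powers of gamma * lambda (a weighted sum, not a normalized average). The function names and the toy features below are hypothetical.

```python
def trace_recursive(features, gamma, lam):
    """Accumulating eligibility trace built step by step."""
    z = [0.0] * len(features[0])
    for x in features:
        z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
    return z


def trace_weighted_sum(features, gamma, lam):
    """Same trace written as an explicit decaying sum over past steps."""
    z = [0.0] * len(features[0])
    for k, x in enumerate(reversed(features)):
        weight = (gamma * lam) ** k        # k = how many steps ago x was visited
        z = [zi + weight * xi for zi, xi in zip(z, x)]
    return z


# Made-up feature vectors for three consecutive states, oldest first.
visited = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
z_rec = trace_recursive(visited, gamma=1.0, lam=0.5)
z_sum = trace_weighted_sum(visited, gamma=1.0, lam=0.5)
print(z_rec, z_sum)
```

The most recent state gets weight 1, the one before it gamma * lambda, and so on, which is why older states fade out of the update.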

zeruniverse commented 7 years ago

The code seems correct. If it does not converge, please set alpha to 1e-3 instead of 1e-2. You can verify the code by setting lamda = 0 and running it: with lamda = 0 the eligibility trace reduces to the current gradient, so if the code is correct, the result should be the same as plain TD.
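
The check suggested above can be sketched as follows. This is a toy version in plain Python with a linear value function, not the repository's TensorFlow code. The functions `td_lambda` and `td_zero`, the one-hot features, and the three-step episode are all hypothetical. The name `lam` is used because `lambda` is a reserved word in Python.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def td_lambda(transitions, n_features, alpha, gamma, lam):
    """TD(lambda) prediction with accumulating traces, linear V(s) = w . x(s)."""
    w = [0.0] * n_features
    z = [0.0] * n_features
    for x, reward, x_next, done in transitions:
        v_next = 0.0 if done else dot(w, x_next)
        delta = reward + gamma * v_next - dot(w, x)
        z = [gamma * lam * zi + xi for zi, xi in zip(z, x)]
        w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        if done:
            z = [0.0] * n_features      # reset the trace between episodes
    return w


def td_zero(transitions, n_features, alpha, gamma):
    """Plain one-step TD prediction with no trace."""
    w = [0.0] * n_features
    for x, reward, x_next, done in transitions:
        v_next = 0.0 if done else dot(w, x_next)
        delta = reward + gamma * v_next - dot(w, x)
        w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]
    return w


# Made-up data: a three-state episode with reward 1 on the last step, repeated 50 times.
s0, s1, s2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
episode = [(s0, 0.0, s1, False), (s1, 0.0, s2, False), (s2, 1.0, s0, True)]
transitions = episode * 50

w_trace = td_lambda(transitions, 3, alpha=1e-2, gamma=0.9, lam=0.0)
w_plain = td_zero(transitions, 3, alpha=1e-2, gamma=0.9)
max_diff = max(abs(a - b) for a, b in zip(w_trace, w_plain))
print(max_diff)
```

If the two weight vectors disagree with the trace parameter at 0, the trace bookkeeping is the likely culprit. A nonzero value such as 0.8 should give different weights on the same data.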