awjuliani / DeepRL-Agents

A set of Deep Reinforcement Learning Agents implemented in Tensorflow.
MIT License

Double-Dueling-DQN: question about the rate to update target network #62

Open oneQuery opened 6 years ago

oneQuery commented 6 years ago

I ran into something I don't understand while working through Double-Dueling-DQN.ipynb.

The notebook defines the following function:

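def updateTargetGraph(tfVars, tau):
    # The code treats the first half of tfVars as the source (primary network)
    # variables and the second half as the target network variables.
    total_vars = len(tfVars)
    op_holder = []  # collects one assign op per target variable
    for idx, var in enumerate(tfVars[0:total_vars//2]):
        # target <- tau * primary + (1 - tau) * target
        op_holder.append(tfVars[idx+total_vars//2].assign((var.value()*tau) + ((1-tau)*tfVars[idx+total_vars//2].value())))
    return op_holder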

What is op_holder, and what role does it play?

I skimmed the Double DQN and Dueling DQN papers again, but I could not find anything about the 'rate to update target network', which this code calls 'tau'.
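
To make the question concrete, here is how I read the arithmetic inside each assign, written in plain Python. This is only a sketch of my understanding: soft_update, primary, and target are hypothetical names of my own, and plain lists of floats stand in for the TensorFlow variables. As far as I can tell, op_holder is just a Python list that collects one such assign operation per target variable.

```python
def soft_update(primary, target, tau):
    """Blend each target value toward the matching primary value.

    Mirrors the expression in updateTargetGraph:
        new_target = tau * primary + (1 - tau) * target
    (hypothetical helper; lists of floats stand in for TF variables)
    """
    return [tau * p + (1.0 - tau) * t for p, t in zip(primary, target)]


primary = [1.0, 2.0]   # stand-in for the primary network's weights
target = [0.0, 0.0]    # stand-in for the target network's weights

# With tau = 1.0 the target becomes an exact copy of the primary values.
hard = soft_update(primary, target, 1.0)

# With a smaller tau the target only moves part of the way on each update.
slow = target
for _ in range(3):
    slow = soft_update(primary, slow, 0.5)

print(hard)
print(slow)
```

If this reading is right, tau only controls how far the target weights move toward the primary weights each time the ops are run, and that is the part I cannot find in either paper.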