google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0

About the IQN loss #27

Closed boluoweifenda closed 6 years ago

boluoweifenda commented 6 years ago

Hi everyone, I am reading the great IQN paper and following the implementation, but I find that the definition of the loss function is slightly different from the one described in IQN and the earlier QR-DQN paper: https://github.com/google/dopamine/blob/master/dopamine/agents/implicit_quantile/implicit_quantile_agent.py#L348

Why is the final quantile Huber loss divided by kappa?

quantile_huber_loss = (
    tf.abs(replay_quantiles -
           tf.stop_gradient(tf.to_float(bellman_errors < 0))) *
    huber_loss) / self.kappa

Although kappa equals 1.0 here, so the division makes no numerical difference, I'm still confused: is it a typo?

psc-g commented 6 years ago

hi, no it's not a typo. if you look at the definition of $\rho^{\kappa}_{\tau}$ in the paper you can see that $L_{\kappa}(\delta_{ij})$ (the huber loss) is divided by $\kappa$.
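for anyone following along, here is a minimal NumPy sketch of that definition, $\rho^{\kappa}_{\tau}(\delta) = |\tau - \mathbb{1}\{\delta < 0\}| \cdot L_{\kappa}(\delta) / \kappa$. this is an illustrative re-derivation, not the Dopamine code itself; the function name and argument names are made up for the example.

```python
import numpy as np

def quantile_huber_loss(bellman_errors, quantiles, kappa=1.0):
    """Per-element rho^kappa_tau(delta) = |tau - 1{delta < 0}| * L_kappa(delta) / kappa."""
    abs_err = np.abs(bellman_errors)
    # Huber loss L_kappa(delta): quadratic inside [-kappa, kappa], linear outside.
    huber = np.where(abs_err <= kappa,
                     0.5 * bellman_errors ** 2,
                     kappa * (abs_err - 0.5 * kappa))
    # Asymmetric quantile weight, then the division by kappa discussed above.
    weight = np.abs(quantiles - (bellman_errors < 0).astype(np.float64))
    return weight * huber / kappa

# with kappa = 1.0 the division is a no-op, matching the observation in the issue:
delta, tau = np.array([0.5]), np.array([0.5])
print(quantile_huber_loss(delta, tau, kappa=1.0))  # |0.5 - 0| * 0.125 / 1 = 0.0625
```

with a different kappa the division matters: it keeps the loss's slope with respect to large errors equal to $|\tau - \mathbb{1}\{\delta < 0\}|$ regardless of kappa, so the huber threshold only smooths the loss near zero rather than rescaling it.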