Closed boluoweifenda closed 6 years ago
Hi everyone, I am reading the great IQN paper and following the implementation, but I find that the definition of the loss function is slightly different from the one described in IQN and the earlier QR-DQN paper: https://github.com/google/dopamine/blob/master/dopamine/agents/implicit_quantile/implicit_quantile_agent.py#L348
Why is the final quantile Huber loss divided by kappa?
quantile_huber_loss = (tf.abs(replay_quantiles - tf.stop_gradient(tf.to_float(bellman_errors < 0))) * huber_loss) / self.kappa
Since kappa equals 1.0, the division makes no numerical difference, but I'm confused here: is it a typo?
Hi, no, it's not a typo. If you look at the definition of $\rho^{\kappa}_{\tau}$ in the paper, you can see that $L_{\kappa}(\delta_{ij})$ (the Huber loss) is divided by $\kappa$.