Hi!
Could you please explain why you implemented the error clipping this way?
```
$ git diff a83c4b359b9
...
- # Clip the error term to be between -1 and 1
- error = y - q_value
- clipped_error = tf.clip_by_value(error, -1, 1)
- loss = tf.reduce_mean(tf.square(clipped_error))
+ # Clip the error, the loss is quadratic when the error is in (-1, 1), and linear outside of that region
+ error = tf.abs(y - q_value)
+ quadratic_part = tf.clip_by_value(error, 0.0, 1.0)
+ linear_part = error - quadratic_part
+ loss = tf.reduce_mean(0.5 * tf.square(quadratic_part) + linear_part)
```
It seems like a good improvement, but it doesn't match what the original paper describes. What is the benefit?
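For context, here is a minimal standalone sketch of how I understand the two formulations behave differently for large errors (TensorFlow 1.x assumed; the placeholder and variable names are just for illustration, not from the repo):

```python
import tensorflow as tf

# Hypothetical standalone comparison of the two loss formulations.
error = tf.placeholder(tf.float32, shape=[None])

# Old formulation: clip the error, then square it.
clipped_error = tf.clip_by_value(error, -1.0, 1.0)
old_loss = tf.reduce_mean(tf.square(clipped_error))

# New formulation: quadratic inside (-1, 1), linear outside.
abs_error = tf.abs(error)
quadratic_part = tf.clip_by_value(abs_error, 0.0, 1.0)
linear_part = abs_error - quadratic_part
new_loss = tf.reduce_mean(0.5 * tf.square(quadratic_part) + linear_part)

old_grad = tf.gradients(old_loss, error)[0]
new_grad = tf.gradients(new_loss, error)[0]

with tf.Session() as sess:
    og, ng = sess.run([old_grad, new_grad],
                      feed_dict={error: [0.5, 2.0, 5.0]})
    # For |error| > 1 the old gradient is 0 (clip_by_value blocks it),
    # while the new gradient stays constant in magnitude.
    print(og)  # e.g. [0.333, 0.   , 0.   ]
    print(ng)  # e.g. [0.167, 0.333, 0.333]
```

If I read it right, the old version kills the gradient entirely once the error leaves (-1, 1), whereas the new one keeps a constant gradient there, but I'd like to confirm that this was the intent.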
Thanks!