tokb23 / dqn

DQN implementation in Keras + TensorFlow + OpenAI Gym

Question about clipping #8

Open aisurfer opened 6 years ago

aisurfer commented 6 years ago

Hi! Could you please explain why you do the error clipping this way?

```diff
$ git diff a83c4b359b9 ...
-    # Clip the error term to be between -1 and 1
-    error = y - q_value
-    clipped_error = tf.clip_by_value(error, -1, 1)
-    loss = tf.reduce_mean(tf.square(clipped_error))
+    # Clip the error, the loss is quadratic when the error is in (-1, 1), and linear outside of that region
+    error = tf.abs(y - q_value)
+    quadratic_part = tf.clip_by_value(error, 0.0, 1.0)
+    linear_part = error - quadratic_part
+    loss = tf.reduce_mean(0.5 * tf.square(quadratic_part) + linear_part)
```

It seems like a good improvement, but it is not the same as in the original paper. What is the benefit? Thanks!
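
For anyone comparing the two versions in the diff, here is a minimal sketch in plain Python rather than TensorFlow, for a single sample. The function names `old_loss`, `new_loss`, `huber` and `grad_wrt_q` are made up for illustration and are not part of the repository. Under those assumptions, it suggests that the new expression matches the Huber loss with `delta = 1`, and that the old expression has zero gradient once the error leaves (-1, 1), while the new one keeps a gradient of magnitude 1 there.

```python
def old_loss(y, q_value):
    """Removed lines: clip the error to [-1, 1], then square it."""
    error = y - q_value
    clipped_error = max(-1.0, min(1.0, error))
    return clipped_error ** 2


def new_loss(y, q_value):
    """Added lines: quadratic inside (-1, 1), linear outside."""
    error = abs(y - q_value)
    quadratic_part = max(0.0, min(1.0, error))
    linear_part = error - quadratic_part
    return 0.5 * quadratic_part ** 2 + linear_part


def huber(y, q_value, delta=1.0):
    """Textbook Huber loss, used here only as a reference."""
    a = abs(y - q_value)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def grad_wrt_q(loss_fn, y, q_value, eps=1e-6):
    """Central finite difference of loss_fn with respect to q_value."""
    return (loss_fn(y, q_value + eps) - loss_fn(y, q_value - eps)) / (2 * eps)


if __name__ == "__main__":
    # Illustrative (y, q_value) pairs: one small error, one large error.
    for y, q in [(0.5, 0.0), (3.0, 0.0)]:
        print(
            "error", y - q,
            "old", old_loss(y, q), grad_wrt_q(old_loss, y, q),
            "new", new_loss(y, q), grad_wrt_q(new_loss, y, q),
            "huber", huber(y, q),
        )
```

With a large error the old loss is flat, so that sample would contribute no gradient to the update, whereas the new loss still pushes `q_value` toward `y` with a bounded step.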