Hi, I found that the gradient calculation of reduce_max in CNTK differs from that of other deep learning libraries such as TensorFlow, and I want to know whether this is a bug.
Here is example code using CNTK 2.7:
import numpy as np
import cntk as C

# CNTK's input_variable defaults to float32, so the data is created as float32
x = C.input_variable(shape=(1, 3, 1), needs_gradient=True)
x_val = np.array([[[0.6],
                   [0.6],
                   [0.3]]], dtype=np.float32)
y = C.reduce_max(x)
g = y.grad({x: x_val})
print("gradients of max: ", g)
The result is:
gradients of max:  [[[[1.]
   [1.]
   [0.]]]]
And this is the code using TensorFlow 2.6.0:
import numpy as np
import tensorflow as tf

with tf.GradientTape() as tape:
    x = tf.Variable([[[0.6],
                      [0.6],
                      [0.3]]])
    y = tf.reduce_max(x)
g = tape.gradient(y, x)
print("gradients of max: ", g.numpy())
The result is:
gradients of max:  [[[0.5]
  [0.5]
  [0. ]]]
The inconsistency appears only when the maximum is attained by more than one element: CNTK gives each max element a gradient of 1, while TensorFlow splits the gradient evenly among the tied elements, as the sketch below illustrates.
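For what it's worth, both results are valid subgradients of max, and the difference seems to be just the tie-handling policy. Here is a minimal NumPy sketch of the two conventions (the names grad_cntk_style and grad_tf_style are my own, for illustration only):

import numpy as np

x = np.array([[0.6],
              [0.6],
              [0.3]])
# 1.0 wherever the maximum is attained, 0.0 elsewhere
mask = (x == x.max()).astype(x.dtype)

grad_cntk_style = mask             # every tied max element gets the full gradient of 1
grad_tf_style = mask / mask.sum()  # the gradient of 1 is split evenly among the ties

print(grad_cntk_style.ravel())  # [1. 1. 0.]
print(grad_tf_style.ravel())    # [0.5 0.5 0. ]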
Any replies will be appreciated.