laocaoshilaocao closed this issue 3 years ago.
expand_dims() is required because autograd (more precisely, ndarray) doesn't perform implicit broadcasting. Is tile() required?
You are right that tile() is not required between 2D and 1D. But I think it is required when the calculation is between 3D and 2D (which is my situation). My example looks like:

```
g.expand_dims([4x5], &[1]) - [2x5]
```

So the first part needs to be tiled along axis 1 from 4x1x5 to 4x2x5, and the second part needs to be tiled along axis 0 from 1x2x5 to 4x2x5, as in the sketch below. Btw, the forward computation works fine, but the gradient computation just doesn't without the dimension expansion.
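A minimal sketch of what I mean (assuming autograd's `ag::with`-style API, a `tile(x, axis, num)` op, and `g.variable`; the shapes and values are made up):

```rust
use autograd as ag;
use ndarray::Array2;

fn main() {
    ag::with(|g: &mut ag::Graph<f64>| {
        let a = g.variable(Array2::<f64>::ones((4, 5))); // the [4x5] part
        let b = g.variable(Array2::<f64>::ones((2, 5))); // the [2x5] part
        let a3 = g.expand_dims(a, &[1]); // (4, 5) -> (4, 1, 5)
        let a3 = g.tile(a3, 1, 2);       // (4, 1, 5) -> (4, 2, 5)
        let b3 = g.expand_dims(b, &[0]); // (2, 5) -> (1, 2, 5)
        let b3 = g.tile(b3, 0, 4);       // (1, 2, 5) -> (4, 2, 5)
        let diff = a3 - b3;              // exact shape match: (4, 2, 5)
        // With the explicit tile(), the gradient evaluates instead of panicking:
        let sum = g.reduce_sum(diff, &[0, 1, 2], false);
        let grads = g.grad(&[sum], &[a, b]);
        println!("{:?}", grads[0].eval(&[]));
    });
}
```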
> But I think it is required when the calculation is between 3D and 2D
Do you expect 2D -> 3D broadcast? Does TF support it??
> Do you expect 2D -> 3D broadcast? Does TF support it??
Yes, I think TF supports that. For example:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

hidden = tf.constant([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])  # shape (4, 2)
clusters = tf.constant([[1.0, 1.0]])                                    # shape (1, 2)
dist1 = K.expand_dims(hidden, axis=1) - clusters                        # (4, 1, 2) - (1, 2) -> (4, 1, 2)
```
Hmm, in the above example an implicit broadcast (1,2) -> (4,1,2) does look like it's occurring, but autograd and ndarray require one more empty dim (tile() is not needed):

```
// (1,2) -> (1, 1, 2)
expand_dims(clusters, axis=0)
```
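Spelled out in autograd, the whole example would look something like this (a sketch against the `ag::with`-style API; only the extra expand_dims is needed, no tile()):

```rust
use autograd as ag;
use ndarray::array;

fn main() {
    ag::with(|g: &mut ag::Graph<f64>| {
        let hidden = g.constant(array![[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]); // (4, 2)
        let clusters = g.constant(array![[1.0, 1.0]]);                                   // (1, 2)
        let h = g.expand_dims(hidden, &[1]);   // (4, 2) -> (4, 1, 2)
        let c = g.expand_dims(clusters, &[0]); // (1, 2) -> (1, 1, 2)
        // Only size-1 axes remain to be stretched, which ndarray can broadcast:
        let dist1 = h - c; // (4, 1, 2)
        println!("{:?}", dist1.eval(&[]));
    });
}
```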
I think implicit broadcasting is evil, and ndarray's spec is reasonable :thinking:
> Hmm, in the above example an implicit broadcast (1,2) -> (4,1,2) does look like it's occurring, but autograd and ndarray require one more empty dim (tile() is not needed)
Haha okay, I got it. You are right that implicit broadcasting makes the whole code quite messy to read. :)
> (tile() is not needed)
But when I hit this same situation, tile() is still needed, I believe?

```rust
let points: Array2<f64> = array![[1.0, 2.0], [1.0, 4.0], [1.0, 4.0], [10.0, 2.0], [10.0, 4.0], [10.0, 0.0]];
let points_t: ag::Tensor<f64> = g.constant(points);
let centroids: Array2<f64> = array![[0.0, 0.0], [1.0, 1.0]];
let centroids_t: ag::Tensor<f64> = g.constant(centroids);
let points_expanded = g.expand_dims(points_t, &[0]);       // (6, 2) -> (1, 6, 2)
let centroids_expanded = g.expand_dims(centroids_t, &[1]); // (2, 2) -> (2, 1, 2)
let t = points_expanded - centroids_expanded;              // both sides need stretching here
```

If I run this, I get an error like:

```
thread 'main' panicked at 'ndarray: could not broadcast array from shape: [2, 1, 2] to: [1, 6, 2]'
```
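Expanding and then tiling both operands to the common shape does work for me. Continuing the snippet above (a sketch, again assuming a `tile(x, axis, num)` op):

```rust
let points_tiled = g.tile(points_expanded, 0, 2);       // (1, 6, 2) -> (2, 6, 2)
let centroids_tiled = g.tile(centroids_expanded, 1, 6); // (2, 1, 2) -> (2, 6, 2)
let t = points_tiled - centroids_tiled;                 // exact shape match: (2, 6, 2)
```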
Hi, during the development of my neural network algorithm, I found that the gradient method always fails for tensors computed from operands of two different dimensions. If I try to print the grad result, the error is either

```
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ...'
```

or

```
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value: ...'
```

That influences the expressions available during development a lot. After testing, I found that expanding t2's dimensions solves the problem, roughly like this:
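(A reconstruction for illustration only; `t1` and `t2` are hypothetical stand-ins with made-up shapes.)

```rust
use autograd as ag;
use ndarray::{Array2, Array3};

fn main() {
    ag::with(|g: &mut ag::Graph<f64>| {
        let t1 = g.variable(Array3::<f64>::zeros((4, 2, 5))); // hypothetical (4, 2, 5) tensor
        let t2 = g.variable(Array2::<f64>::zeros((2, 5)));    // hypothetical (2, 5) tensor
        // Without this expand_dims, the forward pass can succeed
        // while evaluating g.grad(...) panics:
        let t2e = g.expand_dims(t2, &[0]); // (2, 5) -> (1, 2, 5)
        let y = g.reduce_sum(t1 - t2e, &[0, 1, 2], false);
        let grad = g.grad(&[y], &[t2]);
        println!("{:?}", grad[0].eval(&[]));
    });
}
```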
However, that approach definitely requires much more development effort, since methods like reduce_sum are really commonly used. Do you have any other ideas for solving this problem?