Problem
The tensor dimensions in the example are transposed, presumably by mistake. As a result, the loss did not converge and stayed high, at roughly 0.5 to 0.6, as can be seen in the Coursera video.
Solution
Swap the dimensions: use [4,1] instead of [1,4]. With this change, the loss converges to well below 0.2.
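
To illustrate why the two shapes are not interchangeable, here is a minimal sketch in NumPy, which stands in for the framework used in the course example. It assumes the tensor holds four scalar values, one per sample, and that the model emits one prediction per sample; the data and the names `values`, `preds`, `row`, and `col` are hypothetical and not taken from the original code. Under those assumptions, a [1,4] tensor is silently broadcast against [4,1] predictions, so every prediction is compared with every value instead of only its own.

```python
import numpy as np

# Hypothetical data: four scalar values, one per sample (an assumption).
values = [0.0, 1.0, 1.0, 0.0]

row = np.array(values).reshape(1, 4)  # shape [1,4]: one row of four entries
col = np.array(values).reshape(4, 1)  # shape [4,1]: four rows of one entry

# Hypothetical model output: one prediction per sample, shape [4,1].
preds = np.array([[0.1], [0.9], [0.8], [0.2]])

# [4,1] minus [1,4] broadcasts to [4,4]: each prediction is compared
# against all four values, not just the matching one.
diff_transposed = preds - row

# [4,1] minus [4,1] stays [4,1]: an element-wise comparison.
diff_matched = preds - col

mse_transposed = float(np.mean(diff_transposed ** 2))
mse_matched = float(np.mean(diff_matched ** 2))

print(diff_transposed.shape, mse_transposed)
print(diff_matched.shape, mse_matched)
```

With the transposed shape, the averaged error includes the mismatched pairs, so it stays high even when each prediction is close to its own value. Whether this is the exact mechanism behind the non-converging loss in the course example is an assumption; the sketch only shows that the two shapes behave differently.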