mml-book / mml-book.github.io

Companion webpage to the book "Mathematics For Machine Learning"

Why is the gradient not defined as the transpose of the Jacobian matrix, as indicated in "Matrix Differential Calculus"? #753

Closed hopezh closed 1 year ago

hopezh commented 1 year ago

In the side note on p. 150, it is stated that "The gradient of a function f: Rn -> Rm is a matrix of size m x n", i.e. the Jacobian matrix.

However, "Matrix Differential Calculus" by Jan R. Magnus and Heinz Neudecker states that "the transpose of the m x n Jacobian Matrix, i.e. an n x m matrix, is called the gradient...".

So, which one is correct?

  1. Is the Jacobian matrix the gradient?
  2. or is the transpose of the Jacobian matrix the gradient?

P. 150 of Mathematics for Machine Learning: [screenshot]

P. 97 of Matrix Differential Calculus: [screenshot]

mpd37 commented 1 year ago

The Jacobian is the matrix formed by the partial derivatives of a vector-valued function. Each of its rows is the gradient of a scalar-valued function. We use "Jacobian" and "gradient" interchangeably for vector-valued functions, and with this convention the dimensions work out nicely with the chain rule.
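A small sketch of why the book's m x n (numerator-layout) convention is convenient: the chain rule for a composition becomes a plain matrix product in composition order. The functions f and g below are made up for illustration; the composed Jacobian is checked against central finite differences.

```python
import numpy as np

# f: R^3 -> R^2 and g: R^2 -> R (toy examples, not from the book).
# In the book's convention, the Jacobian of an R^n -> R^m function
# is m x n, so J_{g o f}(x) = J_g(f(x)) @ J_f(x) in that order.

def f(x):
    return np.array([x[0] * x[1], x[1] + x[2]])

def Jf(x):  # 2 x 3 Jacobian of f
    return np.array([[x[1], x[0], 0.0],
                     [0.0,  1.0, 1.0]])

def g(y):
    return np.array([y[0] * y[1]])

def Jg(y):  # 1 x 2 Jacobian of g
    return np.array([[y[1], y[0]]])

x = np.array([1.0, 2.0, 3.0])

# Chain rule in numerator layout: shapes (1x2) @ (2x3) -> (1x3).
J_chain = Jg(f(x)) @ Jf(x)

# Central-difference check of the composed Jacobian.
eps = 1e-6
numeric = np.array([[(g(f(x + eps * e)) - g(f(x - eps * e)))[0] / (2 * eps)
                     for e in np.eye(3)]])
assert np.allclose(J_chain, numeric, atol=1e-5)

# Magnus & Neudecker's "gradient" is the n x m transpose of this;
# with that convention the chain-rule factors multiply in reverse order.
grad_MN = J_chain.T  # 3 x 1
```

So the disagreement is purely one of layout convention: the same partial derivatives are arranged as an m x n matrix in the book and as its n x m transpose in Magnus & Neudecker.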