whitead / dmol-book

Deep learning for molecules and materials book
https://dmol.pub

Chapter 1. Tensors and shapes #27

Closed. lilyminium closed this issue 3 years ago.

lilyminium commented 3 years ago

I split this up from #26 for clarity. Most feedback is quibbling about language choice, so I broke that up into sections by heading as well.

1. Tensors and shapes

Rank can be defined as the number of indices required to get individual elements of a tensor.

Matrix rank is a concept from linear algebra and has nothing to do with tensor rank.

I suppose using tensor order to distinguish it from matrix rank is unconventional? It's just a little confusing, since linear algebra is fairly pertinent to a lot of machine learning.
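The two notions disagree even on simple arrays. A quick illustration:

>>> import numpy as np
>>> a = np.zeros((3, 3))
>>> a.ndim  # tensor rank/order: two indices needed to get an element
2
>>> np.linalg.matrix_rank(a)  # linear-algebra rank: the zero matrix has rank 0
0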

A euclidian vector

A Euclidean vector?

1.2.1 reduction operations

sum(a, axis=0)

I understand that numpy is implied, but sum is also a built-in function, and the built-in does not take axis as a keyword argument. Perhaps np.sum would be clearer?
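For example (comments show the results):

import numpy as np

a = np.array([[1, 2], [3, 4]])
np.sum(a, axis=0)  # array([4, 6]): numpy reduces along the requested axis
# sum(a, axis=0)   # TypeError: the built-in sum has no axis keyword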

I've not used Jupyter Book before -- if intersphinx works on it, it could be a really great way to auto-link to other libraries' documentation.

1.4 Modifying rank

In tensorflow and jax there is expand_dims 😠 You can also use reshape and ignore newaxis

I'm not really sure how to interpret this angry face... 😅 numpy has expand_dims too!

And I think jax.numpy.newaxis also works?

>>> import jax.numpy as jnp
>>> arr = jnp.arange(12).reshape((3, 4))
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
>>> arr[..., jnp.newaxis].shape
(3, 4, 1)
>>> arr[jnp.newaxis].shape
(1, 3, 4)
>>> arr.reshape((2, 6))
DeviceArray([[ 0,  1,  2,  3,  4,  5],
             [ 6,  7,  8,  9, 10, 11]], dtype=int32)
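And for completeness, numpy spells it np.expand_dims and jax mirrors it (continuing the session above):

>>> jnp.expand_dims(arr, axis=-1).shape
(3, 4, 1)
>>> import numpy as np
>>> np.expand_dims(np.ones((3, 4)), axis=0).shape
(1, 3, 4)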

1.4.1 Reshaping

There is one special syntax element to shaping: -1 dimensions. -1 can appear once in a reshape command and means to have the computer figure out what goes there by following the rule that the number of elements in the tensor must remain the same. Let’s see some examples.

I found the second sentence hard to follow. Would some punctuation and formatting help?

-1 can appear once in a reshape command. It tells the computer to figure out what goes there by following the rule that the total number of elements in the tensor must remain the same.

Below I also suggest my personal understanding of -1.

-1 can appear once in a reshape command. Because the total number of elements in the reshaped tensor must remain the same, -1 stands for "everything else".
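For example, a couple of shapes under that reading:

>>> import numpy as np
>>> a = np.arange(12)
>>> a.reshape((3, -1)).shape  # "everything else" is 12 / 3 = 4
(3, 4)
>>> a.reshape((-1, 2)).shape  # 12 / 2 = 6
(6, 2)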

1.5 Chapter Summary

There are operations that reduce ranks of tensors, like sum or mean.

This can be a pain to do consistently, but for functions that are also English words I like to use code-style formatting (i.e. `sum` or `mean`) to distinguish when I'm talking about code vs maths.

1.6.2 Reductions

Just a suggestion: many undergraduate courses check code function by providing some test cases and example answers. It might be nice to provide an example input-output pair in Markdown code so people can check their work. e.g.

import numpy as np

input_arr = np.array([1, 10, 2, 3])
output_arr = np.array([0.0625, 0.625, 0.125, 0.1875])
# np.allclose is safer than exact == for floating-point outputs
assert np.allclose(normalize_vector(input_arr), output_arr)

1.6.3 Broadcasting

write python code to compute their outter product.

outter -> outer

You have a tensor of unknown rank A and would like to subtract 3.5, and 2.5 from every element so that your output, which is a new tensor B, is rank of rank(A) + 1. The last axis of B should be dimension 2.

I had to read over this a few times before coming to an interpretation that accounted for every word. Is this what you're after?

>>> import numpy as np
>>> subtract_twice = lambda x: np.stack([x, x], axis=-1) - [3.5, 2.5]
>>> a = np.arange(24).reshape((6, 4))
>>> subtract_twice(a).shape
(6, 4, 2)
>>> subtract_twice(a.reshape((2, 3, 4))).shape
(2, 3, 4, 2)

If so, can I please suggest breaking up the question for more immediate understanding? It's more repetitive but people don't typically read textbooks for the prose.

You have a tensor of unknown rank A. You would like to perform two operations on it at once: firstly, subtracting 3.5 from every element; and secondly, subtracting 2.5 from every element. Your output should combine the results and should be a new tensor B with rank rank(A) + 1. Hint: the new axis of B should be last, and should have dimension 2.

(I originally sat wondering how subtracting 6 should give me dimension 2).

whitead commented 3 years ago

Thanks for taking the time to write this out!

1. Tensors and shapes

Yes, order can distinguish between them, but most libraries use rank, so we'll use that instead. I've revised this to be clearer.

1.2.1 reduction operations

Was not aware of intersphinx! Looks useful, will add an issue about it.

1.4 Modifying rank

The angry face was more about the lack of consistency. I put it next to pytorch's unsqueeze, since that's the real odd one out!
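For anyone comparing, pytorch's spelling of the same operation:

>>> import torch
>>> torch.ones(3, 4).unsqueeze(-1).shape
torch.Size([3, 4, 1])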

1.4.1 Reshaping

Tried to revise to be clearer

1.5 Chapter Summary

revised

1.6.2 Reductions

I added one, will add more soon!

1.6.3 Broadcasting

Your understanding is correct. Revised to be clearer. A short answer (using the lessons on broadcasting!) is x[..., np.newaxis] - [3.5, 2.5]. I've added some input/output examples, which are probably easier to understand.
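A quick shape check of that one-liner, reusing the example array from above:

>>> import numpy as np
>>> a = np.arange(24).reshape((6, 4))
>>> (a[..., np.newaxis] - [3.5, 2.5]).shape
(6, 4, 2)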

Thank you again for the detailed feedback and I hope you're learning something new!

Closed by 0d89a4d059f2ef76d17385e18a324e112fc5e039