EricForgy / TensorAlgebra.jl


Linear Algebra pedagogy #2

goretkin opened this issue 4 years ago

goretkin commented 4 years ago

It might be outside the scope of this package, but if there were examples explained in terms of more familiar linear algebra, that would be so, so lovely.

I tried to give examples of what I think are analogies to linear algebra in the @show _ == _ statements below.

using TensorAlgebra: VectorSpace, Covector, ⊗, Tensor
using LinearAlgebra: dot

V = VectorSpace(:V, Float64)
v = Vector(V, [1, 2, 3])    # a vector in V
α = Covector(V, [1, 2, 3])  # a covector on V
# expected errors: a vector should only eat a covector, and vice versa
@show v(v)
@show α(α)

V = VectorSpace(:V, Float64)
W = VectorSpace(:W, Float64)

a = Float64[1,2,3]
b = Float64[1,2,3,4]
α = Covector(V, a)
β = Covector(W, b)
T = Float64[1 2 3 4;5 6 7 8;9 10 11 12]  # a 3×4 array of components
t = Tensor((V⊗W)^*, T)                   # a tensor that eats a pair of covectors

@show dot(T, a * b') == t(α⊗β)

@show (a' * T * b) == t(α, β)
hack1(x) = x'
@show collect(a' * T) == collect(hack1(t(α, -))) # I don't think there is a way to define `collect` such that `hack1` is not necessary, but I also don't know how to best make this analogy
@show collect(T * b) == collect(t(-, β)) # maybe change definition of `collect` to not require `hack1`

And then the manual could even show where the analogy cannot go any further, since there are only two "sides" you can multiply a "2D matrix" by.
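For example, here is where it stops, in plain Julia with no package types (the array S and the names below are just for illustration): a 3D array of components has three "sides", and matrix syntax only reaches two of them, so evaluating a hypothetical rank-3 tensor on three covectors has to be an explicit sum.

# no `*` chaining reaches the third axis; only an explicit contraction does
S = reshape(Float64.(1:24), 2, 3, 4)
a3, b3, c3 = Float64[1, 2], Float64[1, 2, 3], Float64[1, 2, 3, 4]
@show sum(S[i, j, k] * a3[i] * b3[j] * c3[k] for i in 1:2, j in 1:3, k in 1:4)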

EricForgy commented 4 years ago

Hi @goretkin 👋

Just seeing this now. I saw your other issue #1 and apologized there, but I'll apologize here too since I feel bad.

The package was broken when you tried it. I have some decent tests, but I was in a rush after the last round of changes and foolishly announced the package without running the tests.

You are absolutely right to expect v(v) and α(α) to error, and they should have. I've fixed it now and all the tests pass. You might want to have a look at the tests; that might actually help a little bit.

https://github.com/EricForgy/TensorAlgebra.jl/blob/master/test/runtests.jl

AND... thank you for your feedback. I'll think about how to relate this to more familiar linear algebra.

Until then, here are a few things to take note of...

A matrix M is a tensor Tensor(V^*⊗W,marray) for some marray::Array{Float64,2}.

A vector v is a tensor Tensor(V^*,varray) for some varray::Array{Float64,1}.

A covector α is a tensor Tensor(V,alphaarray) for some alphaarray::Array{Float64,1}.

You can think of vectors as "column matrices" and covectors as "row matrices". Note: a row matrix can be defined in its own right and does not need to be the transpose of some column matrix. This is important.

If we want to represent a vector v as a column matrix, we can write [v] to make it clear.

If we want to represent a covector α as a row matrix, we can write [α] to make it clear.

Similarly, if we want to represent M as a 2D array of numbers, we can write [M] to make it clear.

With that in mind, it might help to write

α(v) = [α][v] # row matrix times a column matrix

and

M(-,v) = [M][v] # matrix times a column matrix resulting in a column matrix
M(α,-) = [α][M] # row matrix times a matrix resulting in a row matrix
M(α,v) = [α][M][v] # row matrix times a matrix times a column matrix resulting in a scalar
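A quick sanity check of those three lines with ordinary arrays (nothing from this package; the variable names are made up for the sketch):

using LinearAlgebra
Marr = Float64[1 2 3; 4 5 6]  # [M], a 2×3 array of components
varr = Float64[1, 1, 1]       # [v], a column
aarr = Float64[1, -1]         # [α], the components of a row
@show Marr * varr              # [M][v] is a column
@show aarr' * Marr             # [α][M] is a row
@show aarr' * Marr * varr      # [α][M][v] is a scalar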

I'm not super happy to write that down because the analogies only go so far, but maybe it helps.

For instance

v(α) !== [v][α]

obviously, but in the docs, I mention that we identify the "double dual" of a vector with itself because we can define

v**(α) := α(v)

and that constitutes a natural transformation.
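Here is a toy version of that definition with plain Julia closures (the names Cov and doubledual are just for the sketch, not package API):

struct Cov             # a covector is its action on component vectors
    a::Vector{Float64}
end
(α::Cov)(v::Vector{Float64}) = sum(α.a .* v)

doubledual(v::Vector{Float64}) = α -> α(v)   # v ↦ v**, i.e. v**(α) := α(v)

v = [1.0, 2.0, 3.0]
α = Cov([1.0, 0.0, -1.0])
@assert doubledual(v)(α) == α(v)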

I hope this helps and sorry again the package was broken when you tried it. Thank you for submitting the issues to bring it to my attention 🙏😊

goretkin commented 4 years ago

Hi @EricForgy , thanks for the reply! And thank you for this package!! I think it's great that the package has tests already.

I hope it helps to say that there's no need to apologize!

goretkin commented 4 years ago

It's fair if your answer is [a more polite version of] "go read a book", but I wanted to write down some questions.

You can think of vectors as "column matrices" and you can think of covectors as "row matrices". Note, a row matrix can be defined as a row matrix and does not need to be the transpose of some column matrix. This is important.

Is this a statement about the absence of a structure that is present in linear algebra, in which every row matrix can be seen as the transpose of some column matrix (right?)?

If we want to represent a vector v as a column matrix, we can write [v] to make it clear.

I'm wondering if this concept could somehow be reified with an actual operation. Probably not `[...]` in Julia, because that should be reserved to mean "make a length-1 Vector". Whatever the name of the operation is, I think it would also apply to other "matrix representations" like

https://en.wikipedia.org/wiki/Complex_number#Matrix_representation_of_complex_numbers https://en.wikipedia.org/wiki/Dual_number#Linear_representation https://en.wikipedia.org/wiki/Quaternion#Matrix_representations

possibly totally outside the scope of this package to define those methods, but what I dream of is that they could all be written as methods of some generic function.

Perhaps what tensor algebra writes as [α], [v], [M] could be written as `represent(Array, v)` or `represent(Matrix, v)`. At the risk of totally going off the rails here,

`represent(Array, α)` could produce an `a1::LinearAlgebra.Adjoint{T,Array{T,1}}` and `represent(Matrix, α)` could produce an `a2::Matrix{T}`. Both would have size `(1, length(α))`.
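To make that concrete, here is the kind of generic function I'm imagining, with a method for the complex-number representation linked above (everything here, including the name represent, is hypothetical):

function represent end   # a generic function collecting representation methods

# a + bi as the 2×2 matrix [a -b; b a]
represent(::Type{Matrix}, z::Complex) = [real(z) -imag(z); imag(z) real(z)]

z, w = 1.0 + 2.0im, 3.0 - 1.0im
# the representation respects both + and *, i.e. it is a ring homomorphism
@assert represent(Matrix, z) + represent(Matrix, w) == represent(Matrix, z + w)
@assert represent(Matrix, z) * represent(Matrix, w) ≈ represent(Matrix, z * w)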

For instance v(α) !== [v][α], obviously

To check, it's obviously the case because the right-hand side is an "outer product" and evaluates to a matrix / something with tensor-rank 2, whereas the left-hand side corresponds to an inner product and evaluates to a scalar / something with tensor-rank 0.

Can the blank in _____ == [v]'*[a] be filled with something in the tensor language?

"double dual" of a vector

Is it a vector that has a dual, or is it a vector space, or is it both?

I mention that we identify the "double dual" of a vector with itself

And is it instructive or beneficial to not identify the double dual of a vector space with the vector space itself? I guess this might be hard to answer, in the same way that it can be valid to see the integers as a subset of the rational numbers, but also valid to say that the element types of those sets are different, so that asking whether an integer equals a rational is either 1. a type error, or 2. answered "no". In Julia terms, it's a question of isequal vs ===. And so

julia> ((V^*)^*) == V
true

julia> ((V^*)^*) === V
true

Perhaps a better question is: is there any benefit to making ((V^*)^*) === V false, or should they really be egal?

I recognize these are basic questions ("go read a book"), for which I'm sorry. I guess at least the issue title is honest about the agenda: to leverage someone's (uh, my) linear algebra knowledge (including knowledge that is perhaps kind of sloppy, because it conflates things that should be seen as distinct) so that they can use this package to actually write clearer code.

EricForgy commented 4 years ago

It's fair if your answer is [a more polite version of] "go read a book", but I wanted to write down some questions.

Hah! I wouldn't do that 😊

I'm happy to answer questions if I can.

is this some statement about the absence of a structure that is present in linear algebra, in which every row matrix can be seen as the transpose of some column matrix (right?)?

Yes. In the standard library, LinearAlgebra, they have two covector-like types:

  1. Transpose
  2. Adjoint

However, not all covectors arise from the transpose / adjoint of a vector. I discussed this a bit in the issue.

One important example that comes to mind is the differential

df = (∂f/∂x) dx + (∂f/∂y) dy + (∂f/∂z) dz

with

[df] = [∂f/∂x  ∂f/∂y  ∂f/∂z]

a row vector. LinearAlgebra currently does not have a proper covector type.
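For instance, the closest you can get in Base Julia is an Adjoint row of partial derivatives (the function f and the finite-difference step below are just for illustration):

f(x) = x[1] * x[2] + sin(x[3])

# approximate [df] at x by central differences; the Adjoint row is the
# hoop we jump through in place of a real covector type
function differential(f, x; h = 1e-6)
    n = length(x)
    [(f(x .+ h .* ((1:n) .== i)) - f(x .- h .* ((1:n) .== i))) / 2h for i in 1:n]'
end

@show differential(f, [1.0, 2.0, 0.0])   # ≈ [∂f/∂x ∂f/∂y ∂f/∂z]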

I'm wondering if this concept could somehow be reified with an actual operation. Probably not `[...]` in Julia, because that should be reserved to mean "make a length-1 Vector". Whatever the name of the operation is, I think it would also apply to other "matrix representations" like

https://en.wikipedia.org/wiki/Complex_number#Matrix_representation_of_complex_numbers https://en.wikipedia.org/wiki/Dual_number#Linear_representation https://en.wikipedia.org/wiki/Quaternion#Matrix_representations

possibly totally outside the scope of this package to define those methods, but what I dream of is that they could all be written as methods of some generic function.

This is a great question and not really out of scope 😊

The first two examples, complex numbers and dual numbers, are commutative (complex numbers form a field, dual numbers a commutative ring), so this package should be able to handle them directly as scalars; a matrix representation should not be required. Quaternions form a noncommutative ring, so that gets a little out of scope for tensor algebra. A tensor with values in a ring gives a "module". It could be nice to add modules / matrix-valued tensors to this package. We'll see 😊

Perhaps what tensor algebra writes as [α], [v], [M] could be written as `represent(Array, v)` or `represent(Matrix, v)`. At the risk of totally going off the rails here,

Not off the rails 😊

The method I use for this is collect. In pseudocode, we could write:

[v] = collect(v)

collect returns an array form of the tensor, which is just an nD array of numbers.
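For the tensor t in the snippet at the top of this issue, that means (hedging slightly, from memory):

collect(t) == T            # the 2D array of components back out
collect(t(-, β)) == T * b  # a 1D array for the partially-applied tensor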

`represent(Array, α)` could produce an `a1::LinearAlgebra.Adjoint{T,Array{T,1}}` and

This hurts my eyes a little bit 😅

This is kind of what I was talking about above. LinearAlgebra doesn't have a first-class covector, so we need to jump through some hoops like that ☝️

To check, it's obviously the case because the right-hand side is an "outer product" and evaluates to a matrix / something with tensor-rank 2, whereas the left-hand side corresponds to an inner product and evaluates to a scalar / something with tensor-rank 0.

This is pretty unfortunate in my opinion. [v][α] absolutely is not an "outer product" and I think Julia (we) made a mistake. The outer product would be [v]⊗[α]. We should not identify * with ⊗, because there is another natural use for * that we can't use now. I hope this mistake can be corrected, but I doubt it. It is kind of a blemish. This is discussed in the two GitHub issues I mention in the Discourse announcement and in the docs.
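Concretely, with plain arrays (defining a throwaway outer just for the sketch, since ⊗ here belongs to the package):

outer(x::Vector, y::Vector) = [xi * yj for xi in x, yj in y]   # x ⊗ y

v2 = Float64[1, 2]
α2 = Float64[3, 4, 5]
# a column matrix times a row matrix gives the same 2D array of numbers,
# which is exactly why conflating * with ⊗ is so tempting
@assert reshape(v2, :, 1) * reshape(α2, 1, :) == outer(v2, α2)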

Is it a vector that has a dual, or is it a vector space, or is it both?

Great question 😊

A vector space V always has a dual V^* (as far as I know; certainly for finite-dimensional vector spaces). But for a vector v in V, v only has a unique dual covector if V has a "metric", i.e. a dot product (which is often, but not always, the case).
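A sketch of that last point with plain arrays (g below is just an example metric):

using LinearAlgebra

g = Symmetric([2.0 1.0; 1.0 3.0])   # a metric on a 2D space
flat(v) = (g * v)'                   # v ↦ its unique dual covector, as a row

v = [1.0, 2.0]
w = [0.5, -1.0]
@assert flat(v) * w ≈ dot(v, g * w)  # v♭ applied to w equals g(v, w)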

And is it instructive or beneficial to not identify the double dual of a vector space with the vector space itself?

Another great question. If you keep digging, you find that asking whether two things are the "same" is kind of tricky. Is "same" the "same" as "equivalent"? (Down the rabbit hole we go!)

Have a look here and the surrounding material for some fun reading 😊

https://en.wikipedia.org/wiki/Natural_transformation#Double_dual_of_a_vector_space

Since (V^*)^* is naturally isomorphic to V, it is good enough as "same" for me 😊

I hope this helps 😊