Open dlfivefifty opened 1 year ago
Yes please!
Any thoughts on the following? Let's consider matrices: 4-tensor * matrix is not defined:
julia> n = 10; T² = randn(n,n,n,n); X = randn(n,n);
julia> T²*X
ERROR: MethodError: no method matching *(::Array{Float64, 4}, ::Matrix{Float64})
Should it be:
Option 1: (T²*X)[k,j] == T²[k,j,:,:] * X
Option 2: (T²*X)[k,j] == permutedims(T²[k,j,:,:]) * X
Option 2 seems like the "right way" to make it consistent. But perhaps I missed something. You would also need to make sense of tensor * vector.
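To make the two candidate conventions concrete, here is a loop-based sketch that reads `(T²*X)[k,j]` as the (k,j)-th matrix slice of a 4-tensor result. This assumes dense arrays throughout; the function names are illustrative, not a proposed API:

```julia
n = 4
T² = randn(n, n, n, n)
X  = randn(n, n)

# Option 1: the (k,j) slice is T[k,j,:,:] * X
function mul_option1(T, X)
    out = similar(T, size(T, 1), size(T, 2), size(T, 3), size(X, 2))
    for k in axes(T, 1), j in axes(T, 2)
        out[k, j, :, :] = T[k, j, :, :] * X
    end
    out
end

# Option 2: the (k,j) slice is permutedims(T[k,j,:,:]) * X
function mul_option2(T, X)
    out = similar(T, size(T, 1), size(T, 2), size(T, 4), size(X, 2))
    for k in axes(T, 1), j in axes(T, 2)
        out[k, j, :, :] = permutedims(T[k, j, :, :]) * X
    end
    out
end

mul_option1(T², X) ≈ mul_option2(T², X)  # false in general
```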
I think the cleanest way to express tensor contractions is Einstein notation, since there's otherwise an ambiguity in which indices go together. This is implemented in e.g. Tullio.jl.
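For concreteness, here is one possible index pairing for 4-tensor * matrix written in Einstein notation with Tullio.jl (assuming the package is installed; other pairings are equally expressible, which is exactly the point):

```julia
using Tullio  # assumes Tullio.jl is installed

n = 4
T = randn(n, n, n, n)
X = randn(n, n)

# Indices appearing only on the right-hand side (a, b) are summed over:
@tullio Y[k, j] := T[k, j, a, b] * X[a, b]
```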
However, I guess (and hope!) that we would also support non-dense four-tensors, or lazy four-tensors (e.g. Kronecker products, etc), and then Tullio.jl will not work, since it assumes dense tensors (although it seems to support ranged indices, so the tensors need not be full).
The existing packages (Tullio.jl, Tensors.jl, TensorOperations.jl) seem to be solving a different problem than the one we have: I think we only need tensor * tensor operations, and are really just in need of a "language" for this, whereas they are trying to do a variety of tensor operations efficiently. Einstein notation is certainly not the right answer for what we need.
Well, the problem, as I see it, is that ad hoc definitions of higher-order tensor multiplication do not scale; what happens for, e.g., six-tensor * three-tensor? I.e.
(S*T)[a,b,c] = S[a,i,b,j,c,k] * T[i,j,k]
or
(S*T)[a,b,c] = S[a,b,c,i,j,k] * T[i,j,k]
or even
(S*T)[a,b,c] = S[i,a,j,b,k,c] * T[i,j,k]
which makes more sense? To me, there is no obvious "right" answer.
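The ambiguity is easy to see numerically: the three candidate contractions above, written out as explicit sums over dense arrays (small n, purely illustrative), give three different results:

```julia
n = 2
S = randn(n, n, n, n, n, n)
T = randn(n, n, n)
r = 1:n

# (S*T)[a,b,c] = S[a,i,b,j,c,k] * T[i,j,k]
v1 = [sum(S[a, i, b, j, c, k] * T[i, j, k] for i in r, j in r, k in r)
      for a in r, b in r, c in r]
# (S*T)[a,b,c] = S[a,b,c,i,j,k] * T[i,j,k]
v2 = [sum(S[a, b, c, i, j, k] * T[i, j, k] for i in r, j in r, k in r)
      for a in r, b in r, c in r]
# (S*T)[a,b,c] = S[i,a,j,b,k,c] * T[i,j,k]
v3 = [sum(S[i, a, j, b, k, c] * T[i, j, k] for i in r, j in r, k in r)
      for a in r, b in r, c in r]
# All three are n×n×n arrays, and generally v1, v2, v3 all differ.
```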
Also, if you write your kernel as a lazy outer (Kronecker) product, and then apply it to a tensor using Einstein notation, it is fairly obvious how to optimize the implementation (illustrated using matrix–vector multiplication, but this scales):
K[i,j] = u[i]*v[j] # This should not be instantiated as a dense matrix, but lazily
z = rand(n)
w[k] = K[i,j]*z[j]
etc...
But maybe I'm missing your point?
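The optimization above can be sketched for the rank-1 case: applying K[i,j] = u[i]*v[j] to a vector z without ever materializing K, by contracting v with z first:

```julia
using LinearAlgebra

n = 1000
u, v, z = randn(n), randn(n), randn(n)

w_dense = (u * v') * z     # O(n²) storage and work: instantiates K
w_lazy  = u * dot(v, z)    # O(n): contract v with z first, never form K
# w_dense ≈ w_lazy
```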
Maybe there could be different versions of multiplication for different use cases,
but for my needs just multiplying the last dimensions would suffice.
Then implement the one you need — we can always generalize later. Maybe make some intermediary type hierarchy, so we can dispatch off of it?
OMG it's so simple: a kernel K.(x, x') is just represented as P*A*P', where A is the matrix of coefficients in the P basis!
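A discrete analogue of this observation: if the kernel expands as k(x, y) = Σ_{m,n} A[m,n] p_m(x) p_n(y), then on any sample grid the kernel matrix is exactly P*A*P'. Monomial basis here, purely illustrative:

```julia
x = range(-1, 1; length = 20)
M = 4
P = [xi^m for xi in x, m in 0:M-1]   # P[i, m+1] = p_m(x_i)
A = randn(M, M)                      # coefficients in the P basis
k(xi, yj) = sum(A[m+1, n+1] * xi^m * yj^n for m in 0:M-1, n in 0:M-1)
K = k.(x, x')
# K ≈ P * A * P'
```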
That is true; however, it would be good if we could support general linear operators, since a dense matrix A can be rather sizable.
As an example, in AtomicStructure.jl I have a kind of integral operator K whose action on a test vector v is (Kv)(x) = a(x)*\int_0^\infty b(x')*v(x') dx', which I solve by solving auxiliary Poisson problems. This way, the computation reduces to linear complexity in the basis size, instead of quadratic if you were to precompute the matrix upfront.
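On a quadrature grid, that operator action can be sketched as follows; the grid, weights, and choice of a, b here are all illustrative, not what AtomicStructure.jl actually does:

```julia
# (Kv)(x) = a(x) * ∫₀^∞ b(x') v(x') dx', approximated on a uniform grid
x  = range(0, 20; length = 400)
w  = step(x)                          # crude uniform quadrature weight
a  = exp.(-x); b = exp.(-x); v = sin.(x)
Kv = a .* (w * sum(b .* v))           # O(n): never forms the n×n kernel matrix
```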
That just corresponds to a rank-1 matrix, so you can do that with structured matrices.
Maybe I messed up the notation, but that is definitely a dense (non-local) operator.
Something like this:
rank-1 matrices are dense
We can probably generalise that code to be used for any AbstractQuasiMatrix * Basis, but I'll have to think a bit more about that.
Want to do:
We have some experiments for 2D quasivectors. Here K.(x, x') is a quasi-matrix. Do we want to support bases as 4-tensors, i.e. T²[x,y,k,j], so that if X isa AbstractMatrix we can write K = T² * X?
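A dense stand-in for that proposal, contracting the coefficient matrix X against the last two (basis) indices of T² (discrete indices here in place of the continuous x, y of a quasi-array):

```julia
# Proposed T² * X: K[x, y] = Σ_{k,j} T²[x, y, k, j] * X[k, j]
nx, nb = 8, 4
T² = randn(nx, nx, nb, nb)
X  = randn(nb, nb)
K  = [sum(T²[x, y, k, j] * X[k, j] for k in 1:nb, j in 1:nb)
      for x in 1:nx, y in 1:nx]
```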