Jutho / TensorOperations.jl

Julia package for tensor contractions and related operations
https://jutho.github.io/TensorOperations.jl/stable/

How can I perform star contraction? #50

GiggleLiu opened this issue 6 years ago

GiggleLiu commented 6 years ago

In Einsum.jl

@einsum n[i,j,k] = hg[i,m]*hg[j,m]*hg[k,m]

defines a star-shaped contraction, which produces a rank-3 tensor.

However, @tensor in TensorOperations.jl raises an error for this expression. How can I define such a contraction correctly?

GiggleLiu commented 5 years ago

Here is a case where it does not raise an error, but the result is wrong...

using TensorOperations

K = 0.0
D = 2
inds = 1:D

M = [sqrt(cosh(K)) sqrt(sinh(K)); 
     sqrt(cosh(K)) -sqrt(sinh(K))]

T = zeros(Float64, D, D, D, D)
for i in inds, j in inds, k in inds, l in inds
    for a in inds
        T[i, j, k, l] += M[a, i] * M[a, j] * M[a, k] * M[a, l]
    end
end

@tensor TT[i, j, k, l] := M[a, i] * M[a, j] * M[a, k] * M[a, l]
@show TT
@show T

Its output is

TT = [4.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]
T = [2.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]

[0.0 0.0; 0.0 0.0]
Jutho commented 5 years ago

TensorOperations only accepts strict Einstein summation, i.e. every summation index should appear exactly twice. The fact that you don't get an error is indeed a bug in TensorOperations; one that I will have to look into.
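
For reference, a minimal sketch of the usual workaround: introduce an explicit rank-3 delta ("copy") tensor so that every summation index appears exactly twice, which @tensor accepts. The names hg, δ3 and n below are illustrative only; the same trick with a rank-4 delta tensor reproduces the loop result T from the snippet above.

using TensorOperations

D = 2
hg = rand(D, D)

# rank-3 delta ("copy") tensor: δ3[a,b,c] = 1 if a == b == c, 0 otherwise
δ3 = zeros(D, D, D)
for a in 1:D
    δ3[a, a, a] = 1.0
end

# every contracted index (a, b, c) now appears exactly twice
@tensor n[i, j, k] := hg[i, a] * hg[j, b] * hg[k, c] * δ3[a, b, c]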

GiggleLiu commented 5 years ago

Could you please also consider implementing a fallback to a general einsum? I think the current Einsum.jl is not only poorly maintained, but also has poor performance. See https://github.com/ahwillia/Einsum.jl/issues/1#issuecomment-420921663

Jutho commented 5 years ago

The problem with generalizing the index notation is that it is not really all that well defined. I have a vague plan to implement a more general / completely different index notation macro based upon the broadcasting and reduction functionality in https://github.com/Jutho/Strided.jl, but to give an example of things that are confusing:

if A[i,j,k] = B[a,i]*B[a,j]*B[a,k] works, then probably also A[i] = B[i,a] should work, and just do a reduction over the index a, i.e. the latter is equivalent to A = sum(B, dims = 2). But now, what does A[i] = C[i] + B[i,a] do? Is the sum over a constrained to the second term, i.e. is it like A = C + sum(B, dims = 2)? Or is it more like

for i in axes(A, 1)
    for a in axes(B, 2)
        A[i] += C[i] + B[i, a]
    end
end

i.e. A = sum(C .+ B, dims = 2), such that C is also added to A a number of times equal to the range of the index a.
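
A small sketch showing that the two readings really differ (B, C, A1 and A2 are illustrative names, not part of any package):

B = rand(3, 4)
C = rand(3)

# reading 1: the sum over a applies only to B
A1 = C .+ vec(sum(B, dims = 2))

# reading 2: C sits inside the reduction, so it is added size(B, 2) times
A2 = vec(sum(C .+ B, dims = 2))

# false in general; the two results differ by (size(B, 2) - 1) .* C
A1 ≈ A2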

GiggleLiu commented 5 years ago

I see, so you mean Einsum is well defined for pure contraction/broadcasting/reduction operations but ill defined for linear algebra, right?

About your package Strided.jl: the idea is interesting and the performance is amazing. It would also be a good place for general einsum-like contraction/broadcasting/reduction.
I will keep an eye on that project; thank you for introducing me to such a useful package.

Jutho commented 5 years ago

In light of your recent work with @under-Peter I thought some more about 'star contractions', but I still don't understand their importance. You don't encounter these types of contractions in performance-sensitive parts of tensor network algorithms, right? Only at the construction phase of, say, encoding a classical partition function as a tensor network. Then why not just create a delta tensor?

As a physicist, my other problem with the concept, aside from the ambiguities it introduces in the notation as indicated above, is that it is not basis independent, i.e. what is a star contraction in one choice of basis is something completely different (namely a contraction with the basis-transformed delta tensor) in a different basis. As a consequence, the concept of star contraction does not combine well (at all?) with efforts to encode explicit symmetries in tensor networks.

GiggleLiu commented 5 years ago

Thanks for looking into it. I agree that we don't encounter these types of contractions in performance-sensitive parts of tensor network algorithms, and it is also true that any Einstein notation can be converted to tensor network contractions by introducing delta tensors.

But it really depends on the application; the above statements are only true for tensor network algorithms. In the machine learning and quantum computing world, factor graphs play an important role in probabilistic graphical models, and they are equivalent to einsum. Tensor networks are to factor graphs as simple graphs are to hypergraphs. In a hypergraph, a multi-edge (an index that appears in multiple tensors) has to be converted to delta tensors, and this process is not as trivial as in the star contraction case. For example, in the following contraction of a quantum circuit

https://github.com/GiggleLiu/tensors_tutorial_jizhi/blob/master/notebooks/qc_tensor_mapping.ipynb

@tensoropt begin
    out[f,g,h,i,j] :=
    ψ[a,b,c,d,e] * 
    H1[a,f1] * H2[f2,b1] * H3[f3,c1] * H4[f4,d1] * H5[f5,e1] *
    H1[b2,g1] * H2[g2,c2] * H3[g3,d2] * H4[g4,e2] *
    H1[c3,h1] * H2[h2,d3] * H3[h3,e3] *
    H1[d4,i1] * H2[i2,e4] *
    H1[e5,j] *
    δ3[b2,b1,b] *
    δ4[c3,c2,c1,c] *
    δ5[d4,d3,d2,d1,d] *
    δ6[e5,e4,e3,e2,e1,e] *
    δ3[i2,i1,i] *
    δ4[h3,h2,h1,h] *
    δ5[g4,g3,g2,g1,g] *
    δ6[f5,f4,f3,f2,f1,f]
end

It is equivalent to

out2 = ein"abcde,af,fb,fc,fd,fe,bg,gc,gd,ge,ch,hd,he,di,ie,ej->fghij"(ψ,H1,H2,H3,H4,H5,H1,H2,H3,H4,H1,H2,H3,H1,H2,H1);

Clearly, introducing a rank-6 tensor is not efficient. This is the paper about mapping a quantum circuit to a probabilistic graphical model: https://arxiv.org/abs/1712.05384

About "not basis independent": I don't really understand what it means. It sounds like, when considering good quantum numbers, it is hard to assign a direction to a hypergraph edge, which makes it hard to implement symmetries, right?

Note: the current star contraction in OMEinsum is not specialized; its performance is the same as the fallback implementation (15 times slower than PyTorch on GPU). But as you mentioned, its performance is not so important for the TensorNetworkAD project.

Jutho commented 5 years ago

I could probably provide a more efficient implementation with Strided.jl.

Basis dependence just means that, if you have some contraction X[a,b,c] = A[i,a]*B[i,b]*C[i,c], this only holds in the particular basis in which you choose to express the entries of the objects A, B, and C. If you do a basis transform of the first dimension, such that A[i,a] -> O[i,j]*A[j,a] with O some orthogonal matrix, you end up with X[a,b,c] = O[i,j]*O[i,k]*O[i,l] * A[j,a]*B[k,b]*C[l,c], and so O[i,j]*O[i,k]*O[i,l] is not a delta tensor. For a normal contraction (only two indices i), it would still be the identity if O is orthogonal. In the case of unitaries or general linear transformations, you still need to distinguish between covariant and contravariant indices (bra and ket) and only contract between the former and the latter, in order to have a basis-independent expression.
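
A quick numerical sketch of this point (the variables O, δ3, pair and triple are illustrative; the random orthogonal matrix is taken from a QR decomposition):

using LinearAlgebra, TensorOperations

D = 3
O = Matrix(qr(rand(D, D)).Q)    # a random orthogonal matrix

# pairwise contraction over i: still the identity, since O'O = I
@tensor pair[j, k] := O[i, j] * O[i, k]
pair ≈ Matrix(I, D, D)          # true

# rank-3 delta tensor, used to write the triple contraction in strict Einstein form
δ3 = zeros(D, D, D)
for a in 1:D
    δ3[a, a, a] = 1.0
end

# three copies of O contracted over the same index: no longer a delta tensor
@tensor triple[j, k, l] := δ3[a, b, c] * O[a, j] * O[b, k] * O[c, l]
triple ≈ δ3                     # false in general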

I'll try to read a bit about factor graphs.

GiggleLiu commented 5 years ago

Cool. Thanks for your clear explanation :heart:, I learnt a lot from your post.

I'll try to read a bit about factor graphs.

Section 8.4.3 of this book would be a good starting point

http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf