ahwillia / Einsum.jl

Einstein summation notation in Julia
MIT License

Support for offsets in indices #6

Open ntezak opened 8 years ago

ntezak commented 8 years ago

Hi, for my use cases (quantum dynamics solvers) it would be super useful to be able to specify integer offsets in the indices. For example:
A = zeros(10);

X = randn(10);
Y = randn(10);

@einsum A[j] = X[j]*Y[j+3]
@assert isapprox(A, [X[1:7] .* Y[4:end]; zeros(3)])

Here the macro ought to infer that effectively A[1:7] = X[1:7] .* Y[4:10] and A[8:10] = 0. Taken just by itself, it may not be obvious why this is more useful than just using subarrays, but especially in combination with in-place operations (see issue #5) this could be quite powerful.
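
Concretely, the intended semantics would be roughly equivalent to this hand-written loop (just a sketch of what the macro ought to infer, not a proposed expansion):

fill!(A, 0)                                      # zero A so entries outside the valid range end up 0
jmax = min(length(A), length(X), length(Y) - 3)  # == 7 for the arrays above
for j = 1:jmax
    A[j] = X[j] * Y[j + 3]
end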

If this seems of general use, I would be happy to take care of the implementation.

ahwillia commented 8 years ago

There has been a bit of discussion about this in the past, see https://github.com/ahwillia/Einsum.jl/issues/2

I'm not opposed to this at all and think it could be quite nice. We do need to think carefully about conventions, though. How would we handle the following:

@einsum A[j] := X[j+3]

Note that := specifies that A is not preallocated. Do we set length(A) == length(X) or length(A) == length(X)-3?

Do we also want to support this?

offset = 3
@einsum A[j] := X[j+$offset]

ntezak commented 8 years ago

Those are good questions!

  1. A conservative approach might be to start by only supporting pre-allocated array expressions. This would give the most flexibility with the fewest required code changes.
  2. I would say let's start only with static integer offsets known at the time of macro invocation. Writing more general code later is always an option, but when the offset (and specifically its sign) is known at compile time, it should lead to better performance.

ahwillia commented 8 years ago

A conservative approach might be to start by only supporting pre-allocated array expressions.

:+1: to this. I think this is a good place to start.

when the offset (and specifically its sign) is known at compile time, it should lead to better performance

Perhaps my use of $ is misleading. I was thinking:

@einsum A[j] := X[j+$offset]

Would expand to:

for j = 1:(length(X)-offset)
    A[j] = X[j+offset]
end

I.e. it would not expand to A[j] = X[j+3].

So I think this would be totally feasible and not hurt performance.

ntezak commented 8 years ago

Oh ok, gotcha, yeah, I think that should definitely be supported!

ahwillia commented 8 years ago

We should use :offset rather than $offset -- the latter will try to interpolate, which is not what we want.

ahwillia commented 8 years ago

I'm somewhat close to having this work. I still need to figure out conventions though.

@einsum A[j] := X[j + :offset]

What is the appropriate dimension check for this?

@assert size(A,1) == size(X,1) - offset
@assert size(A,1) >= size(X,1) - offset

Or we could opt for:

i_max = min(size(A,1), size(X,1) - offset)
for i = 1:i_max
    A[i] = X[i+offset]
end

I kind of like this last convention the best.
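
Assuming for the moment that A is preallocated (the conservative case above), this convention would work out to something like the following, e.g. with offset = 3:

X = randn(10)
A = zeros(10)
offset = 3
i_max = min(size(A,1), size(X,1) - offset)   # == 7 here
for i = 1:i_max
    A[i] = X[i+offset]                       # fills A[1:7] with X[4:10]
end
# A[8:10] are left untouched (still zero)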

ahwillia commented 8 years ago

Closed by https://github.com/ahwillia/Einsum.jl/issues/9

ntezak commented 8 years ago

Hi, sorry for being out of touch! I think there are some issues with the current implementation:

  1. Due to the current allocation convention, negative RHS index offsets are not supported. I think in this case the allocated array should be the minimum size necessary for all the LHS indices to make sense:

    X = randn(10)
    
    @einsum A[i] := X[i-5]
    @test size(A) == (10,)
    @test all(A[6:end] .== X[1:5])

    => currently fails

  2. More importantly, it does not support multiple (different) offsets for the same index on the RHS (see the hand-written version after this list):

    X = randn(10)
    @einsum A[i] := X[i+3]*X[i-3]
    @test size(A) == (7,)
    @test isapprox(A[4:7], X[7:end].*X[1:4])

    => currently fails
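
For reference, a hand-written loop that reproduces the expected behavior of the second example (just an illustration of the semantics I'm after, not what my branch actually generates):

X = randn(10)
lo = 1 + 3               # smallest i with i - 3 >= 1
hi = length(X) - 3       # largest i with i + 3 <= length(X), so size(A) == (7,)
A = zeros(hi)
for i = lo:hi
    A[i] = X[i+3] * X[i-3]
end
# A[1:3] stay zero; A[4:7] == X[7:10] .* X[1:4]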

I have a working branch here that addresses both of these. I can make a pull request anytime! ntezak/Einsum.jl@e2ef9c4405035158cce52e249b88df33d2a34bd9