gorgonia / tensor

package tensor provides efficient and generic n-dimensional arrays in Go that are useful for machine learning and deep learning purposes
Apache License 2.0
362 stars 49 forks source link

Dense Iterator off by 1 #16

Open kaarrot opened 6 years ago

kaarrot commented 6 years ago

It appears that in 3x3x3 tensor while iterating the iterator Start is not pointing to the first element so that all values are off by one index.

⎡26   0   1⎤
⎢ 2   3   4⎥
⎣ 5   6   7⎦

⎡ 8   9  10⎤
⎢11  12  13⎥
⎣14  15  16⎦

⎡17  18  19⎤
⎢20  21  22⎥
⎣23  24  25⎦

CODE:

import (
    . "fmt"
    t "gorgonia.org/tensor"
)

func main() {
    a := t.New(t.WithShape(3, 3, 3), t.WithBacking(make([]int, 3*3*3)))
    it := t.IteratorFromDense(a)
    for i, err := it.Start(); err == nil; i, err = it.Next() {
        a.SetAt(i, it.Coord()[0], it.Coord()[1], it.Coord()[2])
    }
    Println(a)
}
chewxy commented 6 years ago

OK. Will investigate.

P/S: you can just do this a.SetAt(i, it.Coord()...)

kaarrot commented 6 years ago

Thanks, this makes it more readable. So, it turns out tensor/iterator_test.go expects having [0,0,0] index at the very end.

var correct = [][]int{
        {0, 0, 1},
        {0, 0, 2},
                ...
        {0, 0, 0},
    }

Probably this is related to some implementation details I don't quite understand at the moment.

chewxy commented 6 years ago

Yeah. So think of it as this:

A has (2,3,4). That means the MAXIMUM coordinate is (1,2,3) - it helps if you think of the maximum value being A[1][2][3]. As a flat slice, the maximum index is 23, given that the slice would have 24 elements.

Assuming that it's row-major with standard strides (12,4,1), the next increment of an index would cause this to happen:

Step Coordinate Index Action
-2 A[1,2,2] 22 Increment Coordinate for Coord[2] => (2->3)
-1 A[1,2,3] 23 Increment Coordinate for Coord[2] => (3->4)

But the last action cannot be done because the maximum size of the coordinate is 4. Coordinate cannot be greater than 3.

So the algorithm will try a "carry", like you would carry a 1 when you add 9+1 = 10. So then the algorithm will try A[1, 3, 0], which will then also cause an error, then it will try A[2,0,0,], which will cause an error, there fore it will try the last carry, causing it to go to [0,0,0]

Does that help?

kaarrot commented 6 years ago

Ok, but why thefunc (it *FlatIterator) Start()returns advanced iterator instead of just setting it to 0? Starting from 0 and returning it.nextIndex,nil instead of return it.lastIndex, nil in func (it *FlatIterator) ndNext() I think aligns things. Well at least the raw data seems to be correct.

Printf("%v", my_tensor.Data()) Before: [26 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25] After: [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26]

Unfortunately I'm still having issues with formatted output:

⎡ 0   1   2   3⎤
⎢ 4   5   6⎥
⎣ 7   8   9⎦

⎡10  11  12⎤
⎢13  14  15⎥
⎣16  17  18⎦

⎡19  20  21⎤
⎢22  23  24⎥
⎣25  26   0⎦

Figuring out right index there is pretty daunting.

chewxy commented 6 years ago

I'm currently trying to work thru your comment. Will take a while. Things I'll definitely check on is the weird formatting issue. Can you share the code that created that?

kaarrot commented 6 years ago

Yes absolutely. https://github.com/kubaroth/tensor/tree/iterator-off-by-one With some attempts to fix formatting (The spacing is not there yet)

I used the following snippet for test the tensor's output (the tensor.Data() and formatted): https://gist.github.com/kubaroth/b6a77619fbd35d7d95b04818b0005924#file-dense_iter-go

A separate issue I run into in my tests was that I had to work locally, directly in gorgonia/tensor. Apparently When importing a forked package gives me the following error: can't load package: package github.com/kubaroth/tensor: code in directory /src/github.com/kubaroth/tensor expects import "gorgonia.org/tensor". How should I fix it without messing around with the imports in all the files?

chewxy commented 6 years ago

the common trick to that is to create the gorgonia.org/tensor directory, and then git clone your repo into that directory.

That's really more of a Go issue than anything. What I personally do is I checkout upstream branches from that dir

chewxy commented 6 years ago

OK.. so I had a look at it. Your code could be fixed by this:

import (
    . "fmt"
    t "gorgonia.org/tensor"
)

func main() {
    a := t.New(t.WithShape(3, 3, 3), t.WithBacking(make([]int, 3*3*3)))
    it := t.IteratorFromDense(a)

    // make a copy of current coord
    current := make([]int, 3)
    copy(current, it.Coord())
    for i, err := it.Start(); err == nil; i, err = it.Next() {
        a.SetAt(i, current...)
        copy(current, it.Coord())
    }
    Println(a)
}

I'm gonna think about having a store for current coord in iterators

kaarrot commented 6 years ago

Although it is not immediately obvious why the copy is needed there (from client's perspective) this solution works for my current needs. Thanks. Long term, would be nice to avoid this extra copy.

chewxy commented 6 years ago

The solution for the iterators to return current coordinates would be to also have a copy of the current coordinate. Copies are a lot cheaper than the % that is required to be done to do the calculation of location

chewxy commented 6 years ago

I'd love some input on what should be a better option

kaarrot commented 6 years ago

Are you considering the alternative option in the attached link?

chewxy commented 6 years ago

which link? (sorry for being dense)

kaarrot commented 6 years ago

https://github.com/kubaroth/tensor/tree/iterator-off-by-one