mratsim / Arraymancer

A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, Cuda and OpenCL backends
https://mratsim.github.io/Arraymancer/
Apache License 2.0
1.34k stars 95 forks source link

Layout (C- or F-) inference leads to incorrect behaviour #660

Open darkestpigeon opened 4 months ago

darkestpigeon commented 4 months ago

Here is a simple example:

import arraymancer

let t = arange(0, 2*3)
let w = t.reshape(2, 3).permute(1, 0).flatten()
echo t
echo w

assert t != w

The assertion actually fails. The tensors are the same, while you'd expect the w tensor to be [0, 3, 1, 4, 2, 5].

The problem lies in the flatten (or equivalently reshape(2*3)) operation: if .is_F_contiguous check passes (and it does in this case), the operation just changes the shape without a copy. In reality we have a non-contiguous C-tensor, so a copy is needed to flatten it. Just because the strides happen to make it look like some contiguous F-tensor, it doesn't become one.

There are two solutions to this problem:

  1. Make the layout an attribute of the Tensor object and support both layouts. This way there will be no confusion, but many operations will have to be adjusted.
  2. Make all tensors C-tensors (i.e. with rowMajor layout). Actually this is already the case, but checks for F-continuity and layout options need to be eliminated from all operations (reshape included), as these are applied erroneously.

P.S. The layout defines the ordering of tensor elements and cannot be inferred from the strides. Suppose we have a 3-dim tensor t. If it has rowMajor layout, the ordering of the elements is (t[0, 0, 0], t[0, 0, 1], ..., t[0, 1, 0], t[0, 1, 1], ..., t[d1 -1, d2-1, d3-2], t[d1-1, d2-1, d3-1]). If the layout is colMajor, the ordering is (t[0, 0, 0], t[1, 0, 0], ..., t[0, 1, 0], t[1, 1, 0], ..., t[d1-2, d2-1, d3-1], t[d1-1, d2-1, d3-1]). This holds irrespective of how the tensor happens to lie in memory. And given the same strides, the tensor can be contiguous if it has one layout and discontiguous if it has the other.

Clonkk commented 1 month ago

AFAIK, the way to change a tensor layout is through initTensorMetadata and not by reshaping which will mostly change strides vector and not actual data.

What I get from this issue :