crystal-data / num.cr

Scientific computing in pure Crystal
MIT License
151 stars 12 forks source link

Transpose before matmul causes malformed result #33

Closed jtanderson closed 4 years ago

jtanderson commented 4 years ago

This is a bit of an odd one :) Here is some code to first create a 3-by-2, transpose it, then scale each column. However, the middle one gets swapped. This only happens when transpose is involved; when the 2-by-3 is created directly, things function correctly.

a = [[0,1], [0,1], [0,1]].to_tensor.astype(Float64)
puts a
#  [[0, 1],
#  [0, 1],
#  [0, 1]]

a = a.transpose
puts a
# [[0, 0, 0],
#  [1, 1, 1]]

b = Tensor.diag([1,2,3].to_tensor.astype(Float64))
puts b
# [[1, 0, 0],
#  [0, 2, 0],
#  [0, 0, 3]]

puts a.matmul(b)
# [[0, 2, 0],
#  [1, 0, 3]]

## Expected output:
# [[0, 0, 0],
#  [1, 2, 3]]

Again, this works correctly when a = [[0, 0, 0], [1, 1, 1]].to_tensor.astype(Float64).

jtanderson commented 4 years ago

Some interesting things to note:

christopherzimmerman commented 4 years ago

Well, I'm glad that someone is actually using the library so I can catch all of these things.

MCVE:

a = [[0,1], [0,1], [0,1]].to_tensor.astype(Float64)
puts a.flags
# Contiguous | OwnData | Write

a = a.transpose
puts a.flags
# Fortran | OwnData | Write

a = a.dup
# Fortran | OwnData | Write

dup will retain memory layout if possible. I do a dup in the multiplication if a matrix isn't c-style contiguous, but I needed to enforce an output memory layout, which you can do via dup(Num::RowMajor)

Better yet, I should allow either input matrix to be transposed, since BLAS can handle that, which would avoid a copy.

There's another bug present here. A transpose doesn't actually own its own data, so the flag needs to get updated.

christopherzimmerman commented 4 years ago

Fixed on master

a = [[0,1], [0,1], [0,1]].to_tensor.astype(Float64)
puts a
#  [[0, 1],
#  [0, 1],
#  [0, 1]]

a = a.transpose
puts a
# [[0, 0, 0],
#  [1, 1, 1]]

b = Tensor.diag([1,2,3].to_tensor.astype(Float64))
puts b
# [[1, 0, 0],
#  [0, 2, 0],
#  [0, 0, 3]]

puts a.matmul(b)
# [[0, 0, 0],
#  [1, 2, 3]]