Add rows to matrix / Add element to Ndarray: shouldn't it add it to the existing object instead of returning a new one ?

mverzilli / crystalla

Crystal library for Numerical Methods. It binds to LAPACK and is unashamedly inspired by Numpy.

MIT License

45 stars 7 forks source link

Add rows to matrix / Add element to Ndarray: shouldn't it add it to the existing object instead of returning a new one ? #14

Closed MatthiasRMS closed 7 years ago

MatthiasRMS commented 7 years ago

I'm experimenting with a new Ndarray class and while creating a class method add_element, I took a look at how the add_rows method was coded for the Matrix class, and I noticed that it wasn't modifying the object, but was returning a new one.

Instinctively, I would expect that I could do:

matrix = Crystalla::Matrix.new([1.0, 2.0, 2.0, 3.0], 2, 2)
p matrix
>> Matrix[[ 1.0, 2.0 ],
       [ 2.0, 3.0 ]]
matrix.add_rows(1, Crystalla::Matrix.new([4.5, 3.0], 1,2))
p matrix 
>> Matrix[[ 1.0, 2.0 ],
       [ 4.5, 3.0 ],
       [ 2.0, 3.0 ]]

Instead, I need to assign it to a new instance to get the modified matrix: matrix2 = matrix.add_rows(1, Crystalla::Matrix.new([4.5, 3.0], 1,2))

Is there a particular reason why you made this choice ? For Ndarrays, it would seem more natural to do:

array = Crystalla::Ndarray.new([1.0, 2.0, 2.0, 3.0])
array.add_element(6.0, 1)
p array 
>> Ndarray [1.0, 6.0, 2.0, 2.0, 3.0]

Let me know what you think !

mverzilli commented 7 years ago

I initially chose to make all operations return new copies because it was safest, and a bit as an inheritance from Matlab, where in general you don't destruct matrices unless you reassign them.

That's why invert! has a bang, to signal that it is "destructive".

Also if you take a look at numpy's docs functions like insert and append return copies (https://docs.scipy.org/doc/numpy/reference/generated/numpy.insert.html#numpy.insert)

I'd say let's stick to returning copies by default. If you feel it's a bit bothering, we can add ! versions of methods (e.g.: add_rows!).

On a separate note: are you familiar with LAPACK? Just in case, keep in mind that LAPACK lays out its memory in column major order, so that may be the best way to keep data internally in ndarray.

MatthiasRMS commented 7 years ago

Makes sense ! I'll stick with non destructive methods for now, and as you said I'll maybe add their destructive equivalent with!. I noticed that the matrix class already implemented only supports Float64 as elements for the matrices. Wouldn't it be more flexible to have ndarrays compatible with both Int32 and Float64 ?

I couldn't find a way to make a class accept an array without specifying the type of elements held by it (it will raise an error asking to specify the type). So maybe a solution would be to have a Ndarray class, and 2 classes (Ndarray::Int32 and Ndarray::Float64) inheriting from it ? Do you see an alternative to keep only one class able to store both arrays of Int32 and arrays of Float64 ?

EDIT: or I can overload the constructor to accept both types

mverzilli commented 7 years ago

At this point I think it would be easier to discuss with some concrete code. Would you mind sending a PR with whatever you have? We can then have this conversation over the PR and evolve it as we discuss.

There are different approaches we could try. One would be to have NDArray be a generic, declaring it as NDArray(T) and then typing the underlying ivars as Array(T) (or whatever you need to hold the data in memory).

That way if you create your NDArray as NDArray.new([1.0, 2.0, 3.0]) Crystal will infer that it is an NDArray(Float64), whereas if you instantiate it as NDArray.new([1, 2, 3]) it'll be typed as NDArray(Int32).

MatthiasRMS commented 7 years ago

Just did in PR #15