Mojo-Numerics-and-Algorithms-group / NuMojo

NuMojo is a library for numerical computing in Mojo 🔥 similar to numpy in Python.
Apache License 2.0
86 stars, 15 forks

NDArray with Row & Column Major ordering #39

Closed: shivasankarka closed this issue 2 months ago

shivasankarka commented 2 months ago

NDArray.🔥

Others

Notes:

forFudan commented 2 months ago

> In NumPy, if a 2x3x4 array is sliced to produce a 2x3x1 array, NumPy prints it as a 2x3x1 array, but our current code prints it as a 2x3 array. I believe this looks more intuitive and maintains parity between row-major and column-major arrays, making it less confusing when switching between the two.

This is quite an interesting discussion point. Reducing the dimension may be more intuitive, but it may also lose dimension information. In NumPy, the sliced array has the same ndim as the original one.

Consider the following matrix A:

[[ 2, 2, 2 ]
 [ 2, 3, 4 ]
 [ 2, 4, 4 ]]

A[:, 1] and A[1, :] would generate the same vector (1-D array) if the dimension is reduced. But some people would prefer to see 3x1 and 1x3 matrices (column and row vectors).

[[ 2 ]
 [ 3 ]
 [ 4 ]]

[[ 2, 3, 4 ]]

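
To make the trade-off concrete, here is a minimal sketch in plain Python (nested lists, no NumPy or NuMojo). The variable names are purely illustrative and the list comprehensions only stand in for A[:, 1] and A[1, :]; this is not how either library implements slicing.

```python
# Matrix A from the example above, stored as a list of rows.
A = [
    [2, 2, 2],
    [2, 3, 4],
    [2, 4, 4],
]

# Dimension-reducing selection: both results are flat 1-D lists.
col_reduced = [row[1] for row in A]  # stands in for A[:, 1]
row_reduced = A[1]                   # stands in for A[1, :]

# Dimension-preserving selection: both results stay 2-D.
col_kept = [[row[1]] for row in A]   # 3x1 column vector
row_kept = [A[1]]                    # 1x3 row vector

print(col_reduced == row_reduced)  # True: the orientation is lost
print(col_kept == row_kept)        # False: the orientation is kept
```

With the reduced form, nothing in the result says whether it came from a row or a column; the 2-D form keeps that information at the cost of an extra axis.
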
shivasankarka commented 2 months ago

@forFudan @MadAlex1997 I went through the NumPy documentation, and it seems they actually support both. Consider the following code:

>>> import numpy as np
>>> arr = np.arange(1, 10, step=1).reshape(3, 3)
>>> print(arr[1, :])  # integer index drops the axis: shape (3,), same as our code
[4 5 6]
>>> print(arr[:, 1])  # shape (3,), same as our code
[2 5 8]

But when you index with slices only, instead of a mixture of integers and slices, the dimensions are kept:

>>> arr = np.arange(1, 10, step=1).reshape(3, 3)
>>> print(arr[1:2, :])  # slice keeps the axis: printed as a 1x3 array
[[4 5 6]]
>>> print(arr[:, 1:2])  # printed as a 3x1 array
[[2]
 [5]
 [8]]

Currently we support both *Slice and Variant[Slice, Int] arguments, so we could change the behaviour so that indexing with *Slice prints a 3x1 or 1x3 array, while Variant[Slice, Int] keeps the current behaviour of printing a shape [3] array.
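
As a rough illustration of that rule, the sketch below computes the result shape for a mixed index in plain Python: an integer drops its axis, a slice keeps it (even at length 1). `sliced_shape` is a hypothetical helper written for this comment, not a NuMojo or NumPy function, and it only covers integers and slices.

```python
def sliced_shape(shape, index):
    """Return the shape obtained by indexing an array of `shape` with `index`.

    Each entry of `index` is either an int (that axis is dropped) or a
    slice (that axis is kept, even if it selects a single element).
    Hypothetical helper for illustration only.
    """
    if len(index) > len(shape):
        raise IndexError("too many indices for array")
    result = []
    for dim, idx in zip(shape, index):
        if isinstance(idx, slice):
            # slice.indices clamps start/stop/step to the axis length.
            result.append(len(range(*idx.indices(dim))))
        elif not -dim <= idx < dim:
            raise IndexError("index out of bounds")
        # A valid int contributes nothing: the axis is removed.
    # Axes not mentioned in the index are kept whole.
    result.extend(shape[len(index):])
    return tuple(result)


print(sliced_shape((3, 3), (1, slice(None))))            # (3,)
print(sliced_shape((3, 3), (slice(1, 2), slice(None))))  # (1, 3)
print(sliced_shape((3, 3), (slice(None), slice(1, 2))))  # (3, 1)
```

Under this rule the 2x3x4 case from the original post gives (2, 3, 1) when the last axis is indexed with 0:1 and (2, 3) when it is indexed with 0.
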

I partially agree with @forFudan and think we should implement both styles of slice calculation and printing, to keep the behaviour similar to NumPy. But that seems to break the symmetry and make things more confusing, so I am not sure what the best way to go about it is.

What do you all think? We don't always have to keep the same convention as NumPy, but it's good in some places. So if you think we could improve on its behaviour, we should do that.

forFudan commented 2 months ago

> What do you all think? We don't always have to keep the same convention as NumPy, but it's good in some places. So if you think we could improve on its behaviour, we should do that.

I agree that we do not need to always keep the same convention as numpy (but parameter names can be the same to reduce the cost of migration). So let's keep it as it is and leave it to the future. :D