j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee
175 stars 14 forks source link

Array broadcasting in expressions #252

Open egiovan opened 2 years ago

egiovan commented 2 years ago

Array broadcasting is very common in Python with numpy and in Matlab.

Boradcasting means that in an array expression if an array has a unitary dimension along an axis that dimension is automatically expanded to match the size of the other array. For example if an array has size [3,1] and another has size [1,4] both array are expanded as they have size [3,4]. In the actual Fortran one has to use spread to achieve the same result. Another nice feature of numpy is the possibility to add unitary dimensions on the fly with None. Python example:

a = np.ones(2)
b= np.ones(3)
a[:,None] * b[None,:]

Moreover some functions that reduce the dimensionality like sum can have an extra arguments keepdims that doesn't reduce the dimensionality but make that dimension unitary. This is useful as one can write (in Python) somenthing like: a/np.mean(a, axis=0,keepdims=True) So I suggest the following feature: Use the symbol "+" (or whatever other symbol) to add a dimension to a section of an array. The function sum and other that reduce the dimensionality should have an extra keyword argument like keepdims that doesn't eliminate that dimension. As an example:

integer :: a(3), b(4), c(3,4), d(4), i
a = [(i=1,3)]
b = [(i=1,4)]
c = a(:,+) + b(+,:)
d = c/sum(c, dim=1, keepdim=.true.)

Contrary what happen to Matlab and Python adding the extra dimension is obligatory (no implicit unary dimensions).

certik commented 2 years ago

@egiovan good idea especially about making it explicit. Do you know if this approach would allow all NumPy's broadcasting features? Or is there still some broadcasting operation in NumPy that your proposal still would not allow?

egiovan commented 2 years ago

As far as I know it is all you need.

Numpy has also functions that returns broadcasted views of list of arrays like:

https://numpy.org/doc/stable/reference/generated/numpy.broadcast_arrays.html

I think the broadcasted view can be obtained with a stride equal to zero. But it would be a mess as it cannot be definable, in python there is a warning when you try to assign values to broadcasted views. so I'm contrary to import a feature like this one.

While sections with added unitary dimensions are perfectly safe, definable, etc.

The point is that you may not know, in general, if an expression will involve broadcasting at compile time (except for those simple examples), so at run time the processor has to check the extent of all the arrays involved in the array expression, and that may affect the speed. On the other hand it may be safer to require the processor to report if in array expressions the dimensions are not compatible. It may make the program a little slower but safer (with an explicit loop the burden should be all on the shoulder of the programmer).

egiovan commented 2 years ago

To make things more explicit one can add an attribute to arrays, like broadcastable which indicates that any unitary dimension of that array can be expanded. A section with an added dimension with "+" like in the previous example will be broadcastable, and the return of a function like the sum with keepdim will be broadcastable. All other array expressions can remain as before, without the overhead to check the extent of all the involved arrays. Broadcastable pointers may have additional unitary dimension added in pointer assignments, like in

ptr(:, +) => b

klausler commented 2 years ago

SPREAD can be implemented without duplicating data. This proposal seems to me to boil down to an automatic application of a non-duplicating SPREAD. It might be better to use SPREAD and insist on better implementations.

egiovan commented 2 years ago

I think it will also be clearer. For example:

integer :: a(2), b(3), c(4), d(2,3,4)

d = a(: , + , +) * b(+, : ,+) * c(+, +, :) 

I wouldn't use spread in that case, it will be a mess. I would use a "do concurrent" in actual Fortran

integer :: i, j, k
do concurrent(i=1:2, j=1:3, k=1:4)
   d(i, j, k) = a(i) * b(j) * d(k)
enddo