j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Dimension-based masks #265

Open perazz opened 2 years ago

perazz commented 2 years ago

One limitation of Fortran's current array handling that I find pretty annoying is that a logical mask cannot represent just one dimension of an array; it has to cover the whole array. I think this limitation makes the mask-accepting array intrinsics de facto useless when handling multidimensional arrays.

Here's an example. I have a table storing a list of quantities over two dimensions:

use, intrinsic :: iso_fortran_env, only: real64   ! provides the real64 kind
integer, parameter :: nFruits = 10
integer, parameter :: nMonths = 12
real(real64), dimension(nFruits,nMonths) :: fruit_harvest

Now, assume I want to do some non-trivial operations with masking, like totalling the harvest of some class of fruit over the summer months. (OK, the conditions in this example could be replaced by index-based slicing, but let's assume they can't.)


integer :: j
logical, parameter :: summer(*)  = [(j>=6 .and. j<=8,  j=1,nMonths)]
logical, parameter :: berries(*) = [(j>=7 .and. j<=10, j=1,nFruits)]
real(real64) :: summer_harvest(nFruits)   ! per-fruit total over the summer months
real(real64) :: summer_berries            ! total over berries and summer months

! Currently the summer mask has to be replicated to the full array shape
summer_harvest = sum(fruit_harvest, dim=2, mask=spread(summer, dim=1, ncopies=nFruits))

! And then filter that again
summer_berries = sum(summer_harvest, dim=1, mask=berries)
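
For comparison, the same total can be obtained today without SPREAD by turning each mask into an index vector with pack and using vector subscripts. This is only a sketch: the program name, the variable names `berry_idx`, `summer_idx`, `via_spread` and `via_subscripts`, and the fill data are made up for illustration, and a vector-subscripted section may well be copied into a temporary, so it is not necessarily faster.

```fortran
program vector_subscript_masks
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none

    integer, parameter :: nFruits = 10
    integer, parameter :: nMonths = 12
    integer :: j
    logical, parameter :: summer(*)  = [(j >= 6 .and. j <= 8,  j = 1, nMonths)]
    logical, parameter :: berries(*) = [(j >= 7 .and. j <= 10, j = 1, nFruits)]

    real(real64) :: fruit_harvest(nFruits, nMonths)
    real(real64) :: via_spread, via_subscripts
    integer, allocatable :: berry_idx(:), summer_idx(:)

    ! Illustrative data only: whole numbers, so both sums are exact
    fruit_harvest = reshape([(real(j, real64), j = 1, nFruits*nMonths)], [nFruits, nMonths])

    ! Convert each one-dimensional mask into the list of indices it selects
    berry_idx  = pack([(j, j = 1, nFruits)], berries)
    summer_idx = pack([(j, j = 1, nMonths)], summer)

    ! Vector subscripts pick the berry rows and the summer columns
    via_subscripts = sum(fruit_harvest(berry_idx, summer_idx))

    ! Reference result: the SPREAD-based two-step reduction
    via_spread = sum(sum(fruit_harvest, dim=2, mask=spread(summer, dim=1, ncopies=nFruits)), &
                     dim=1, mask=berries)

    if (via_subscripts /= via_spread) error stop "the two approaches disagree"
    print *, via_subscripts
end program vector_subscript_masks
```

This only covers reductions over the selected sub-array; it does not help where the result has to keep the original shape.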

Everything would be far simpler if the array intrinsics allowed the logical mask to correspond to a single array dimension (potentially with more than one mask specified, one per dimension). For example:


! Proposed: mask may have either the full array shape or the extent of dimension [dim]
summer_harvest = sum(fruit_harvest, dim=2, mask=summer)

! Proposed: one mask per dimension, each with the extent of that dimension
summer_berries = sum(fruit_harvest, mask(1)=berries, mask(2)=summer)
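
To make the intended semantics concrete, here is a minimal sketch of the two-mask case written as an ordinary user function. The module `dim_masks` and the function `sum_dim_masks` are hypothetical names, not existing intrinsics or stdlib routines; the sketch is limited to rank-2 real64 arrays and assumes the mask sizes match the array extents without checking them.

```fortran
module dim_masks
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: sum_dim_masks
contains
    ! Hypothetical helper: sum a(i,k) over all i with mask1(i) and all k with mask2(k).
    ! Assumes size(mask1) == size(a,1) and size(mask2) == size(a,2) (not checked).
    pure function sum_dim_masks(a, mask1, mask2) result(total)
        real(real64), intent(in) :: a(:,:)
        logical,      intent(in) :: mask1(:)
        logical,      intent(in) :: mask2(:)
        real(real64) :: total
        integer :: i, k

        total = 0.0_real64
        do k = 1, size(a, 2)
            if (.not. mask2(k)) cycle          ! skip whole columns early
            do i = 1, size(a, 1)
                if (mask1(i)) total = total + a(i, k)
            end do
        end do
    end function sum_dim_masks
end module dim_masks

program demo_dim_masks
    use, intrinsic :: iso_fortran_env, only: real64
    use dim_masks, only: sum_dim_masks
    implicit none

    integer, parameter :: nFruits = 10
    integer, parameter :: nMonths = 12
    integer :: j
    logical, parameter :: summer(*)  = [(j >= 6 .and. j <= 8,  j = 1, nMonths)]
    logical, parameter :: berries(*) = [(j >= 7 .and. j <= 10, j = 1, nFruits)]
    real(real64) :: fruit_harvest(nFruits, nMonths)
    real(real64) :: expected, got

    ! Illustrative data only: whole numbers, so both sums are exact
    fruit_harvest = reshape([(real(j, real64), j = 1, nFruits*nMonths)], [nFruits, nMonths])

    ! What the proposed sum(fruit_harvest, mask(1)=berries, mask(2)=summer) would mean
    got = sum_dim_masks(fruit_harvest, berries, summer)

    ! Same quantity using today's intrinsics and a replicated mask
    expected = sum(sum(fruit_harvest, dim=2, mask=spread(summer, dim=1, ncopies=nFruits)), &
                   dim=1, mask=berries)

    if (got /= expected) error stop "helper disagrees with the SPREAD-based result"
    print *, got
end program demo_dim_masks
```

A generic version would need one specific procedure per type, kind and rank, which is part of why built-in support in the intrinsics looks attractive.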

certik commented 2 years ago

Thanks @perazz. For comparison and inspiration, how would this be done in NumPy (if it has this feature)?

perazz commented 2 years ago

If we take the pack intrinsic as an example, NumPy has two equivalent functions:

certik commented 2 years ago

For efficient implementation in a compiler, do you think these can still be implemented as regular Fortran functions (and thus go to stdlib as a start), or does the compiler have to have special knowledge of them in order to be able to optimize them?