jipolanco / PencilArrays.jl

Distributed Julia arrays using the MPI protocol
https://jipolanco.github.io/PencilArrays.jl/dev/
MIT License
60 stars 8 forks source link

Allow unsorted list of decomposed directions #57

Closed jipolanco closed 2 years ago

jipolanco commented 2 years ago

This PR lifts the restriction that the decomp_dims argument to Pencil must be a list of sorted dimensions. For instance, decomp_dims = (1, 3) was allowed while (3, 1) was not.

In practice, this enables transpositions that weren't possible before. This is because, for two decompositions to be compatible for transpositions, their decomp_dims must differ by exactly one element (this is an algorithmic detail which basically simplifies the implementation and allows to minimise inter-process communications). So, decomp_dims = (2, 3) (data is local in x) has always been compatible with decomp_dims = (1, 3) (data is local in y). This PR enables the decomp_dims = (2, 1) (local in z) configuration which wasn't previously possible.

For example:

using MPI
using PencilArrays
using LinearAlgebra: transpose!
using Random

MPI.Init()
comm = MPI.COMM_WORLD
dims_global = (12, 14, 9)

# Create a 2D decomposition in which data is local in the `x` direction.
pen_x = Pencil(dims_global, (2, 3), comm)
ux = PencilArray{Float64}(undef, pen_x)
randn!(ux)

# Create another 2D decomposition in which data is local in `y`.
# (This has always been possible.)
pen_y = Pencil(pen_x; decomp_dims = (1, 3))
uy = PencilArray{Float64}(undef, pen_y)
transpose!(uy, ux)  # transpose between the two configurations

# Create a 2D decomposition in which data is local in `z`.
# To transpose from an `x`-local to a `z`-local decomposition, it was
# previously required to go through the intermediate `y`-local configuration.
# With this PR we can skip that step and do the following:
pen_z = Pencil(pen_x; decomp_dims = (2, 1))  # the order of elements in decomp_dims is important!
uz = PencilArray{Float64}(undef, pen_z)
transpose!(uz, ux)

Note that it should be possible to automate the selection of a "good" order of decomp_dims leading to compatible decompositions, but this is not done for now.