Currently, the interpolation step is performed on the target rank, i.e. not on the source rank where the interpolation points are stored. When the ratio of source ranks to target ranks is large, the interpolation step becomes a critical bottleneck. To resolve this, we should support an additional mode in which the interpolation step is performed on the source rank.
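To make the difference between the two modes concrete, here is a minimal single-process sketch. It is only an illustration under stated assumptions: ranks are simulated as plain Python lists rather than real parallel processes, interpolation is modelled as a weighted sum, and the names `interpolate`, `target_side`, and `source_side` are hypothetical and not taken from the existing code.

```python
def interpolate(values, weights):
    # Assumption: interpolation is a weighted sum over a point's stencil values.
    return sum(w * v for w, v in zip(weights, values))


def target_side(source_ranks):
    """Current mode: the target rank interpolates every received point itself."""
    results = []
    target_ops = 0  # interpolations executed on the target rank
    for points in source_ranks:  # raw point data arriving from each source rank
        for values, weights in points:
            results.append(interpolate(values, weights))
            target_ops += 1
    return results, target_ops


def source_side(source_ranks):
    """Proposed mode: each source rank interpolates its own points locally."""
    results = []
    for points in source_ranks:
        # This list would be computed on the source rank, in parallel with the others.
        local = [interpolate(values, weights) for values, weights in points]
        # The target rank only collects the already interpolated values.
        results.extend(local)
    return results, 0


# Hypothetical setup: 8 source ranks feeding 1 target rank, 3 points per source rank.
source_ranks = [
    [([float(r), float(r + 1)], [0.5, 0.5]) for _ in range(3)]
    for r in range(8)
]

current, current_ops = target_side(source_ranks)
proposed, proposed_ops = source_side(source_ranks)
```

Both modes produce the same interpolated values. In the sketch, the target rank performs all 24 interpolations in the current mode and none in the proposed one, where each simulated source rank handles only its own 3 points.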