Note: this PR came out of a collaboration between ECMWF and Nvidia. We hope to have it merged at some point, but until then I'll leave it as a draft PR.
TODO:
- Instead of using `EXEC_EFFTW`, redefine `EXEC_FFTW`. `EXEC_EFFTW` should technically only be used for the limited-area version of ecTrans, etrans. @dmitrypek used it only because it happened to already have the correct array dimension ordering.
- Implement a similar modification of the inverse transform.
The purpose of this change is to avoid implicit transpositions and the associated unnecessary memory copies.
Work carried out by Dmitry Pekurovsky (dpekurovsky@nvidia.com).