Open dcsale opened 9 years ago
Try adding compiler flags for OpenFOAM in /wmake/rules/linux64Gcc/c++Opt and /wmake/rules/linux64Gcc/cOpt.
More info about how to enable AVX is here: http://stackoverflow.com/questions/943755/gcc-optimization-flags-for-xeon
To enable AVX, the short of it is:
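A minimal sketch of that edit, assuming the rules file sets its optimisation flags on a single line of the form "c++OPT = -O3" (check your own copy first; that layout is an assumption here, and cOpt would use cOPT instead). The helper name append_opt_flags is hypothetical, and the flag shown is just one of the candidates from this thread.

```shell
# append_opt_flags FILE VAR FLAGS
# Append FLAGS to the "VAR = ..." line in FILE, keeping a backup in FILE.orig.
# Hypothetical helper; assumes VAR is assigned on exactly one line.
append_opt_flags() {
    file=$1
    var=$2
    flags=$3
    cp "$file" "$file.orig"
    # In a basic regex the '+' in "c++OPT" is a literal character.
    sed "s|^\($var[[:space:]]*=.*\)\$|\1 $flags|" "$file.orig" > "$file"
}

# Example (path prefix depends on your OpenFOAM installation):
#   append_opt_flags wmake/rules/linux64Gcc/c++Opt 'c++OPT' '-march=corei7-avx'
#   append_opt_flags wmake/rules/linux64Gcc/cOpt   'cOPT'   '-march=corei7-avx'
```

OpenFOAM has to be recompiled afterwards for the new flags to take effect.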
-march=corei7-avx for GCC < 4.9.0 or -march=sandybridge for GCC >= 4.9.0
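The version rule above can be sketched as a small shell helper. The function name avx_march_flag is made up for illustration; it only encodes the 4.9.0 cut-off stated here and does not query the compiler itself.

```shell
# avx_march_flag VERSION
# Print the AVX -march value for a GCC version string such as "4.8.5".
# Hypothetical helper, not part of GCC or OpenFOAM.
avx_march_flag() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    # A bare major version such as "9" has no minor component.
    if [ "$rest" = "$version" ]; then
        minor=0
    else
        minor=${rest%%.*}
    fi
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }; then
        echo "-march=sandybridge"
    else
        echo "-march=corei7-avx"
    fi
}

# Example: avx_march_flag "$(gcc -dumpversion)"
```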
The following will show you all the flags your processor supports:
cat /proc/cpuinfo | grep flags | head -1
or
gcc -march=native -mfpmath=sse -O2 -Q --help=target -v
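If you only care about one or two flags (avx, avx2), a rough sketch like the following avoids reading the whole list by eye. The helper name has_cpu_flag is hypothetical, and it assumes a Linux-style cpuinfo file with a line starting with "flags".

```shell
# has_cpu_flag FLAG [FILE]
# Succeed if FLAG appears as a whole word on the first "flags" line of FILE
# (default /proc/cpuinfo). Hypothetical helper for illustration.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Split the line into one token per line, then match the flag exactly,
    # so that "avx" does not match "avx2".
    grep -m1 '^flags' "$file" | tr -s ' \t:' '\n' | grep -qx "$flag"
}

# Example:
#   for f in avx avx2; do
#       has_cpu_flag "$f" && echo "$f supported" || echo "$f not supported"
#   done
```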
GCC suppresses SSEx instructions when -mavx is used. Instead, it generates new AVX instructions or AVX equivalents for all SSEx instructions when needed. So try these flags in combination:
gcc -march=corei7-avx -mtune
Further explanation at: http://stackoverflow.com/questions/10559275/gcc-how-is-march-different-from-mtune
It seems to me that this should provide the greatest compatibility with different CPUs (at least Intel ones):
-march=corei7-avx -mtune=generic
I tested OpenFOAM v2.4.x compiled with and without the AVX instructions (on an Intel Xeon 2630 v3, which is a Haswell-generation CPU and so supports AVX2 as well as AVX). I ran a few trials of the pisoFoamTurbine solver, 80 iterations each. The compiler flags were passed to both OpenFOAM and the FAST Fortran code. Here are the elapsed wall times:
compiler flags | time [s] | avg. time [s] |
---|---|---|
none | [244, 281, 333, 308, 261, 316] | 291 |
-march=corei7-avx -mtune=generic | [372, 270, 279, 241] | 291 |
-march=core-avx2 | [244, 283, 271] | 266 |
The flag -march=core-avx2 (note that -march=X implies -mtune=X) seems to provide the most aggressive optimization, with a potential speedup of roughly 8 to 9% (266 s vs. 291 s on average). Nothing else was running on the computer, so I wonder why there is such large variability between runs. Anyway, perhaps there is something beneficial about AVX2, so I will leave the flag enabled.
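To put the variability in numbers, here is a rough sketch that computes the mean and sample standard deviation of the wall times in the table. The helper name run_stats is made up. With so few runs per configuration this is only an indication, but the spread comes out at roughly 20 to 57 s, which is comparable to the 25 s difference between the averages, so more runs would be needed to be confident in the speedup.

```shell
# run_stats T1 T2 ...
# Print "mean sd n" for the given wall times (sample standard deviation).
# Hypothetical helper for illustration.
run_stats() {
    echo "$@" | awk '{
        n = NF; sum = 0; ss = 0
        for (i = 1; i <= n; i++) sum += $i
        mean = sum / n
        for (i = 1; i <= n; i++) ss += ($i - mean) ^ 2
        sd = (n > 1) ? sqrt(ss / (n - 1)) : 0
        printf "%.1f %.1f %d\n", mean, sd, n
    }'
}

run_stats 244 281 333 308 261 316   # no extra flags
run_stats 372 270 279 241           # -march=corei7-avx -mtune=generic
run_stats 244 283 271               # -march=core-avx2
```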
Next I should experiment with flags for mpirun, namely the bind-to-socket and bind-to-core options ...
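As a starting point for that experiment, a sketch of the command lines to compare, assuming Open MPI 1.8 or later, where binding is selected with --bind-to and --report-bindings prints the resulting placement (older Open MPI releases and other MPI implementations spell these options differently). The helper name bind_cmd and the process count of 8 are placeholders, not values from the runs above.

```shell
# bind_cmd POLICY NPROCS
# Print an mpirun command line for the solver with the given binding policy.
# Hypothetical helper; assumes Open MPI >= 1.8 option names.
bind_cmd() {
    policy=$1
    nprocs=$2
    echo "mpirun -np $nprocs --bind-to $policy --report-bindings pisoFoamTurbine -parallel"
}

# Print the two variants to compare; run each one from the case directory.
for policy in core socket; do
    bind_cmd "$policy" 8
done
```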
Supposedly an additional compiler flag can give a nice speedup on Xeon 2630 v3 processors ... these CPUs support the AVX instruction set (the v3 generation is Haswell rather than "Sandy Bridge", so AVX2 is available as well). Try the GNU compiler flag "-march=corei7-avx".
More info here: http://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/i386-and-x86_002d64-Options.html and https://www.microway.com/hpc-tech-tips/achieve-the-best-performance-intel-xeon-e5-2600-sandy-bridge/