Open a-szegel opened 1 year ago
FWIW (meaning very little), as one random data point: with the tcp provider, gcc -O2 produced the same performance as icc -O3 or icx -O3 for Intel MPI 16n48ppn SpecMPI and SpecHPC runs. The data wasn't collected to compare compiler optimization levels, so while the optimization level changed between runs, so did the compiler... If anything, the -O3 runs were 1-2% lower in a couple of tests, but that should be within run-to-run variation.
We need data across a variety of platforms to show that this is worth it. Making this an issue to track it because an issue is a better place for that than a PR.
https://github.com/ofiwg/libfabric/pull/8961