FFTW / fftw3

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
GNU General Public License v2.0

default to CNTVCT_EL0 cycle counter on Apple M1 #267

Closed: stevengj closed this 2 years ago

stevengj commented 2 years ago

It seems that, on macOS running on the M1 (aarch64), the CNTVCT_EL0 counter is usable by default but PMCCNTR_EL0 is not. It makes sense to me to enable the former by default in FFTW, as users are unlikely to know about this option and not having a cycle counter really degrades FFTW's performance.

cc @matteo-frigo

PS. Sorry for all of the whitespace changes; my editor is set to automatically delete trailing whitespace in .c files.

rdolbeau commented 2 years ago

I'd say CNTVCT_EL0 should probably be enabled by default on all aarch64 targets, as it is architecturally available. PMCCNTR_EL0 has to be enabled explicitly (access is privileged by default), and it only offers a benefit on systems with a low-frequency CNTVCT_EL0.
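
For illustration, here is a minimal sketch of how the virtual counter and its frequency can be read from user space on aarch64. The names `ticks_t`, `read_virtual_counter` and `virtual_counter_hz` are hypothetical and are not FFTW's actual code; the non-aarch64 branch is only an assumption added so the sketch builds elsewhere.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical names for illustration; not FFTW's actual API. */
typedef uint64_t ticks_t;

/* Read the virtual counter (CNTVCT_EL0) on aarch64. */
static inline ticks_t read_virtual_counter(void)
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t t;
    /* isb keeps the counter read from being executed ahead of
       earlier instructions. */
    __asm__ volatile("isb\n\tmrs %0, CNTVCT_EL0" : "=r"(t) : : "memory");
    return t;
#else
    /* Fallback (assumption): standard C clock() on other architectures. */
    return (ticks_t)clock();
#endif
}

/* Report how many ticks the counter advances per second. */
static inline uint64_t virtual_counter_hz(void)
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t f;
    /* CNTFRQ_EL0 holds the frequency of the system counter. */
    __asm__ volatile("mrs %0, CNTFRQ_EL0" : "=r"(f));
    return f;
#else
    return (uint64_t)CLOCKS_PER_SEC;
#endif
}
```

Comparing `virtual_counter_hz()` with the CPU clock rate gives a rough idea of whether the counter is "low-frequency" in the sense above: the lower the value, the coarser each tick is relative to a CPU cycle.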

stevengj commented 2 years ago

@rdolbeau, in your original patch for ARM support (8aa91763af07767f3ebb71a9836a69e3b3385cab), you tested for CNTVCT_EL0 on aarch64 and commented that it was "not always" available: https://github.com/FFTW/fftw3/blob/8aa91763af07767f3ebb71a9836a69e3b3385cab/configure.ac#L628-L637

What has changed?

rdolbeau commented 2 years ago

@stevengj I don't remember the details, but from the commit message it might not have been user-readable on some older Linux kernels? And from this commit, it is available on macOS on Arm in addition to Linux. I'm not sure about e.g. NetBSD or FreeBSD, I have to say.

I think it's better to enable it so people get the benefit of a somewhat accurate counter, as long as they can disable it if needed or if they want the PMC counter instead.
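
As a rough sketch of that policy (on by default, with an opt-out and an opt-in for the PMC counter), the selection could be expressed at compile time as below. The macros `DEMO_USE_PMCCNTR_EL0` and `DEMO_DISABLE_CNTVCT_EL0` and the `demo_*` functions are hypothetical names invented for this example; they are not FFTW's configure options or internals.

```c
#include <stdint.h>
#include <time.h>

/* All DEMO_* macros and demo_* functions are hypothetical. */
#if defined(__aarch64__) && defined(DEMO_USE_PMCCNTR_EL0)
/* Explicit opt-in: this read may trap unless the OS has enabled
   user-space access to the PMU cycle counter. */
static inline uint64_t demo_getticks(void)
{
    uint64_t t;
    __asm__ volatile("mrs %0, PMCCNTR_EL0" : "=r"(t));
    return t;
}
#define DEMO_TICK_SOURCE "PMCCNTR_EL0"

#elif defined(__aarch64__) && !defined(DEMO_DISABLE_CNTVCT_EL0)
/* Default on aarch64: the virtual counter. */
static inline uint64_t demo_getticks(void)
{
    uint64_t t;
    __asm__ volatile("mrs %0, CNTVCT_EL0" : "=r"(t));
    return t;
}
#define DEMO_TICK_SOURCE "CNTVCT_EL0"

#else
/* Counter disabled or not aarch64: fall back to standard C clock(). */
static inline uint64_t demo_getticks(void)
{
    return (uint64_t)clock();
}
#define DEMO_TICK_SOURCE "clock"
#endif

/* Elapsed ticks between two readings, as a double. */
static inline double demo_elapsed(uint64_t t1, uint64_t t0)
{
    return (double)(t1 - t0);
}
```

With this layout, a plain aarch64 build gets `CNTVCT_EL0` without any user action, while `-DDEMO_DISABLE_CNTVCT_EL0` or `-DDEMO_USE_PMCCNTR_EL0` changes the choice.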