Closed by stevengj 2 years ago
I'd say CNTVCT_EL0 should probably be enabled by default on all aarch64 targets, as it is architecturally available. PMCCNTR_EL0 has to be enabled somehow (access is privileged by default), and only offers a benefit on systems where CNTVCT_EL0 runs at a low frequency.
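For readers following along, here is a minimal sketch of what reading the virtual counter from user space looks like. It is not FFTW's actual cycle.h code: the helper names read_ticks and ticks_per_second are hypothetical, the inline assembly assumes GCC or Clang, and the non-aarch64 branch is only a stand-in (POSIX clock_gettime) so the snippet compiles elsewhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the fallback path */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a monotonically increasing tick counter. */
static inline uint64_t read_ticks(void)
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t t;
    /* CNTVCT_EL0 is the virtual count register of the generic timer. */
    __asm__ __volatile__("mrs %0, cntvct_el0" : "=r"(t));
    return t;
#else
    /* Assumption: fallback for other targets, in nanoseconds. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: how many ticks read_ticks() advances per second. */
static inline uint64_t ticks_per_second(void)
{
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    uint64_t f;
    /* CNTFRQ_EL0 reports the frequency of the generic timer in Hz. */
    __asm__ __volatile__("mrs %0, cntfrq_el0" : "=r"(f));
    return f;
#else
    return 1000000000ull;  /* the fallback above counts nanoseconds */
#endif
}
```

Timing a region is then just the difference of two read_ticks() calls. Comparing ticks_per_second() with the CPU clock gives a rough idea of how coarse the counter is on a given machine, which is the "low-frequency" concern mentioned above.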
@rdolbeau, in your original patch for ARM support (8aa91763af07767f3ebb71a9836a69e3b3385cab), you tested for CNTVCT_EL0 on aarch64 and commented that it was "not always" available: https://github.com/FFTW/fftw3/blob/8aa91763af07767f3ebb71a9836a69e3b3385cab/configure.ac#L628-L637
What has changed?
@stevengj I don't remember the details, but judging from the commit message, the counters might not have been user-readable on some older Linux kernels. And from this commit, CNTVCT_EL0 is available on macOS on Arm in addition to Linux. I'm not sure about, e.g., NetBSD or FreeBSD, I have to say.
I think it's better to enable it so people get the benefit of a somewhat accurate counter, as long as they can disable it if needed or if they want the PMC counter instead.
It seems that CNTVCT_EL0 support is enabled by default on macOS running on the M1 (aarch64), but not PMCCNTR_EL0. It makes sense to me to enable the former by default, as users are unlikely to know about this and not having a cycle counter really degrades FFTW's performance.
cc @matteo-frigo
PS. Sorry for all of the whitespace changes; my editor is set to automatically delete trailing whitespace in .c files.