All AVX-512-capable CPUs from Skylake onward currently prefer 256-bit over 512-bit vector generation - only the Knights Landing family always prefers 512-bit (a very different arch, no VLX support etc.).
AFAICT many of the frequency throttling issues with 512-bit ops were addressed in or after Icelake - so do we need to alter the tuning for later CPUs to let them vectorize to the full 512-bit ALU width?
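
For anyone who wants to see the current behaviour, here is a minimal sketch of a loop whose vectorization width depends on this preference. The `saxpy` kernel is a hypothetical example, not taken from any existing test. Building it with something like `clang -O3 -march=skylake-avx512 -S` should show the loop using `ymm` registers, while adding `-mprefer-vector-width=512` should switch it to `zmm`; exact codegen may vary by compiler version.

```c
#include <stddef.h>

/* Hypothetical kernel for illustration: y[i] += a * x[i].
 * The restrict qualifiers tell the compiler that x and y do not alias,
 * so the loop can be vectorized without runtime overlap checks. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Comparing the two assembly outputs (and timing them on Icelake or later hardware) would give a concrete data point for whether the 256-bit preference is still worth keeping there.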