Open oscardssmith opened 2 weeks ago
Thanks! Yeah, that's a good item for the to-do list :)
Unfortunately, it's probably not going to happen any time soon. It's not easy to pair up the data from Arm's PDFs (e.g. https://developer.arm.com/documentation/109842/latest/), and I don't think Apple's SME/SSVE implementation currently has published timing info, so covering that would be a much bigger project.
I'm also concerned that publishing partial or incomplete information might be misleading; if someone sees a list of timings, they might assume CPUs not on that list don't support the instruction, especially if the CPU appears on other lists. So, yeah, it's not a perfect fit for this format, but I would like to include something basic like you see in the Intel® Intrinsics Guide: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html
Also, SVE still feels very uncommon to me, so feel free to let me know what you're using, or what you consider common architectures.
This site is awesome (especially compared to Arm's official stuff, which is impossible). Would it be possible to add latency and throughput for instructions for some of the common architectures (like uops.info does)? Knowing which instructions are generally slow vs fast can be very helpful for compiler devs and such who want to figure out which instructions are good vs bad to generate.