Target chips for each instruction set

omnisip commented 3 years ago

@zeux's example from the meeting (#369), which uses Godbolt to run llvm-mca analysis, is great. It gives us, at the very least, LLVM's cost analysis and cycle predictions by port usage for any given CPU (with the -mcpu flag). You can see it here.
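To make the per-CPU idea concrete, here is a minimal sketch that only builds the llvm-mca command lines (it does not run them). The target table, the `kernel.s` file name, and the helper `mca_command` are hypothetical, and the CPU names are placeholders for illustration, not a recommendation of which chips to pick.

```python
import shlex

# Hypothetical target list: these CPU names are placeholders for illustration,
# not chips anyone in this thread has agreed on.
TARGETS = {
    "x86_64": ("x86_64-unknown-linux-gnu", "skylake"),
    "arm64": ("aarch64-unknown-linux-gnu", "cortex-a72"),
}


def mca_command(asm_path, triple, cpu):
    """Build (but do not run) an llvm-mca command line for one CPU."""
    return ["llvm-mca", f"-mtriple={triple}", f"-mcpu={cpu}", asm_path]


# Print one command per target so it can be pasted into a shell or CI step.
for name, (triple, cpu) in TARGETS.items():
    print(name, shlex.join(mca_command("kernel.s", triple, cpu)))
```

Swapping the placeholder entries for whichever chips get agreed on would give one analysis command per target.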

If this is scalable and accurate, it could provide a very efficient way to meet @penzn's and @lars-t-hansen's suggestions for performance analyses associated with each PR without adding any undue burden. In that spirit, which specific chips are we using as targets for our analyses? That way, if we're adding instructions, we know we're focusing our efforts on performance improvements that help the most users.

Which would you pick for the following four targets? x86 (32-bit)? x86_64? ARM64? ARMv7+neon?

Actual models or architectures are helpful if they can tell us projected UOPS and Port Usage.

dtig commented 3 years ago

Maybe I'm misunderstanding this question, but the operations in this proposal are high level enough that differences in chips should not matter. IMO having a broad requirement (for example ARMv7+Neon) is more useful than adding target chips.

For those who weren't at the meeting that this issue came out of, can you elaborate on what suggestions this issue is trying to address?

tlively commented 3 years ago

This is for calculating cost models for use in LLVM and other compilers that will produce WebAssembly. The way we are planning to produce a cost model is to generate cost models for specific architectures of interest and combine them in some principled manner to be determined. Since real costs change by microarchitecture, we're not sure how to generate a cost model for e.g. ARMv7+Neon without selecting specific microarchitectures of interest.
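As a rough illustration only: the combining method is still to be determined, so the weighted average below is an assumption, not the plan. The chip names, weights, and cycle counts are made up, and `combine_costs` is a hypothetical helper that assumes every chip reports a cost for every instruction.

```python
def combine_costs(per_chip_costs, weights):
    """Weighted average of per-chip cycle estimates for each instruction."""
    total = sum(weights.values())
    combined = {}
    for chip, costs in per_chip_costs.items():
        share = weights[chip] / total  # normalize so the weights sum to 1
        for instr, cycles in costs.items():
            combined[instr] = combined.get(instr, 0.0) + share * cycles
    return combined


# Made-up numbers purely for illustration, not measurements.
per_chip = {
    "chip_a": {"i8x16.add": 1.0, "i32x4.mul": 4.0},
    "chip_b": {"i8x16.add": 1.0, "i32x4.mul": 2.0},
}
weights = {"chip_a": 1.0, "chip_b": 1.0}  # e.g. an assumed usage share

print(combine_costs(per_chip, weights))
```

Whatever the eventual method is, it needs the same per-chip inputs, which is why specific microarchitectures have to be chosen first.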

WebAssembly / simd

Target chips for each instruction set #387