Feature
Not all architectures have a fast 64-bit imul with an immediate operand. Even on modern microarchitectures such as Intel SnB-family and AMD Ryzen, it has 3-cycle latency and 1c throughput, which is not always faster than a lea + shl / add combination. So I propose lowering multiplication by non-power-of-two constants to lea + shl / add, at least for imm < 400 with low Hamming weight, and falling back to 64-bit imul only when such a lowering is not available. Similar to GCC:
https://godbolt.org/z/aG7bPer9v
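
To make the proposed lowering concrete, here is a minimal C sketch of the shift-and-add decompositions for two low-Hamming-weight constants, 10 (0b1010) and 37 (0b100101). The function names are hypothetical, and the instruction sequences in the comments are the intended x86-64 lowering, not the guaranteed output of any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* x * 10 == (x + 4*x) * 2.
 * Intended lowering: lea t, [x + x*4] ; add t, t   (no imul) */
static uint64_t mul10_shift_add(uint64_t x) {
    uint64_t x5 = x + (x << 2);
    return x5 << 1;
}

/* x * 37 == x + 4 * (x + 8*x).
 * Intended lowering: lea t, [x + x*8] ; lea r, [x + t*4]   (no imul) */
static uint64_t mul37_shift_add(uint64_t x) {
    uint64_t x9 = x + (x << 3);
    return x + (x9 << 2);
}

/* Hypothetical self-check: both forms must agree with a plain multiply,
 * including on wraparound, since unsigned arithmetic is modulo 2^64. */
static void check_shift_add(uint64_t x) {
    assert(mul10_shift_add(x) == x * 10u);
    assert(mul37_shift_add(x) == x * 37u);
}
```

Each decomposition is a dependency chain of two single-cycle operations, so on cores where lea with a scaled index has 1-cycle latency it can beat the 3-cycle imul on latency, at the cost of an extra instruction.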