Quuxplusone / LLVMBugzillaTest


Why is FeatureSlowPMULLD not set for Haswell+? #34921

Open Quuxplusone opened 6 years ago

Quuxplusone commented 6 years ago
Bugzilla Link PR35948
Status NEW
Importance P enhancement
Reported by Ivan G (nekotekina@gmail.com)
Reported on 2018-01-15 05:12:59 -0800
Last modified on 2021-10-05 09:36:32 -0700
Version 6.0
Hardware All All
CC craig.topper@gmail.com, gadi.haber@intel.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks PR32325
Blocked by
See also PR34474, PR52039

Hello, it seems that LLVM sets this flag for Silvermont processors but not for others. On Haswell or Skylake processors, for example, PMULLD has a latency of 10 cycles, whereas other vector multiplication instructions have a latency of 5.

https://bugs.llvm.org/show_bug.cgi?id=28128 seems related.

Quuxplusone commented 6 years ago

Craig/Gadi - any thoughts?

Quuxplusone commented 6 years ago

Silvermont has the additional problem that the reciprocal throughput is also high. Haswell's reciprocal throughput is 2, so it's more reasonable.

This is only checked in reduceVMULWidth, right? I don't think it would change anything about PR28128, since I don't think we can reduce the mul width there.

I haven't looked closely at the alternative sequences reduceVMULWidth will generate.
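
For readers unfamiliar with the idea behind narrowing a vector multiply, here is a scalar sketch of one lane. It assumes both 32-bit operands are known to fit in 16 unsigned bits, in which case the full product can be rebuilt from a 16-bit low multiply (what PMULLW computes per lane) and a 16-bit high multiply (what PMULHUW computes per lane), then interleaved. The function names are hypothetical and this is not the code in reduceVMULWidth, only an illustration of why the narrower sequence gives the same result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one PMULLW lane: low 16 bits of a 16x16 product.
inline std::uint16_t mul_lo16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(static_cast<std::uint32_t>(a) * b);
}

// Hypothetical scalar model of one PMULHUW lane: high 16 bits of the
// unsigned 16x16 product.
inline std::uint16_t mul_hi16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>((static_cast<std::uint32_t>(a) * b) >> 16);
}

// 32-bit multiply rebuilt from the two 16-bit halves. Only valid when both
// inputs fit in 16 bits (an assumption, checked by the assert).
inline std::uint32_t mul_narrow(std::uint32_t x, std::uint32_t y) {
    assert(x <= 0xFFFFu && y <= 0xFFFFu);
    const auto a = static_cast<std::uint16_t>(x);
    const auto b = static_cast<std::uint16_t>(y);
    const std::uint32_t lo = mul_lo16(a, b);
    const std::uint32_t hi = mul_hi16(a, b);
    // Combining the halves plays the role of the unpack/interleave step.
    return (hi << 16) | lo;
}
```

Calling `mul_narrow(x, y)` with in-range inputs should return the same value as `x * y`. Whether the three-instruction vector form actually beats a single PMULLD depends on the target's latency and throughput numbers, which is the question this report raises.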

Quuxplusone commented 6 years ago

PR28128 - we probably need to bump the v4i32/v8i32 mul costs in X86TargetTransformInfo.cpp, but I can't see it affecting the vectorization decision tbh.

I keep meaning to more aggressively use X86ISD::VPMADDWD like Peter suggested on D41484, but again that's a special case.

Quuxplusone commented 6 years ago

Another potential user of this feature flag came up in bug 34474 (mul by pow2 +/- 1), solved by this patch: https://reviews.llvm.org/D52195

Quuxplusone commented 6 years ago

(In reply to Sanjay Patel from comment #4)
> Another potential user of this feature flag came up in bug 34474 (mul by
> pow2 +/- 1) solved by this patch:
> https://reviews.llvm.org/D52195

https://godbolt.org/z/Vq9Pi2 says this is a definite win on Haswell, Broadwell, and Skylakes
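
As a rough illustration of the "mul by pow2 +/- 1" case mentioned above, the multiply can be replaced by a shift plus an add or subtract. The sketch below is scalar and uses hypothetical function names. It is not the transform from D52195, only the arithmetic identity behind it; in vector form the shift would correspond to PSLLD and the add/subtract to PADDD/PSUBD.

```cpp
#include <cassert>
#include <cstdint>

// x * (2^k + 1) computed as (x << k) + x. Unsigned arithmetic wraps
// modulo 2^32, matching a 32-bit low multiply.
inline std::uint32_t mul_pow2_plus1(std::uint32_t x, unsigned k) {
    assert(k < 32);  // shifting a 32-bit value by 32 or more is undefined
    return (x << k) + x;
}

// x * (2^k - 1) computed as (x << k) - x, again modulo 2^32.
inline std::uint32_t mul_pow2_minus1(std::uint32_t x, unsigned k) {
    assert(k < 32);
    return (x << k) - x;
}
```

For example, `mul_pow2_plus1(x, 3)` stands in for `x * 9` and `mul_pow2_minus1(x, 3)` for `x * 7`. Two cheap vector operations in place of one slow PMULLD is the kind of trade that a FeatureSlowPMULLD check would gate.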