Open VinInn opened 7 months ago
cms-bot internal usage
A new Issue was created by @VinInn.
@smuzaffar, @antoniovilela, @Dr15Jones, @sextonkennedy, @rappoccio, @makortel can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
pow(x,2) : no the compiler will not substitute with x*x: we need to introduce our own inline function "square"
depends on the optimization flags according to some old posts
pow(x,2) : no the compiler will not substitute with x*x: we need to introduce our own inline function "square"
depends on the optimization flags according to some old posts
At the very least we should be encouraging using std::pow()
, std::sqrt()
etc., which will let the compiler use the appropriate type-matched inlined function. This may invoke a compiler builtin implementation, and should avoid any unnecessary type promotions to double (esp. important consideration for vectorization). I'm not convinced it's worth a campaign w/o evidence that these are performance bottlenecks.
indeed. but "std::pow(x,2)+std::pow(y,2);" is promoted to double... (as any literal non explicitely float) https://godbolt.org/z/ab1Koqvd8
Interesting that more complex expression lead to such differences (although maybe not entirely surprising either), I'd say that discovery increases the motivation to do something. Some evidence on the cost would still be nice.
Hadn't thought of that. Will also note (one of my hobby horses) that "std::pow(x, 1.f/3.f);" is very different from "std::pow(x, 1./3.);". The former jumps straight to powf(), while the latter converts everything to double, calls pow(), and converts the result back to float.
according to experts: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114363
float z = pow(x,2) is immediately converted to z=x*x because it is guaranteed to produce the same results of invoking the double precision function. all other cases not, so the optimization cannot be applied.
Tying back to my previous comment, "std::pow(x,2.f)+std::pow(y,2.f);" avoids the type promotion, and compiles down to two instructions if FMA is available. Fun, wasn't expecting that either.
@dan131riley yeah, we shall just teach people to use 2.f ...
@dan131riley yeah, we shall just teach people to use 2.f ...
Oh, I agree, teaching average physicists these corner cases is a non-starter. I mostly have the MkFit developers on board, but that's an experienced bunch. Just thinking about what we can do...
most of our use of pow is inefficient if not improper: https://github.com/search?q=repo%3Acms-sw%2Fcmssw+pow%28+language%3AC%2B%2B&type=code&l=C%2B%2B&p=5 1) pow(x,2) : no the compiler will not substitute with xx: we need to introduce our own inline function "square" 2) pow(x,0.5) pow(x,1/3) result correct (yes even for negative x) still a direct call to sqrt and cbrt may be more efficient 3) pow(2,j), pow(10,j), pow (-1, j): no comment 4) pow(x,j) in polynomial: Horner shall be used 5) pow(a,y) where a is a literal: exp(ylog(a)) should be faster provided that the compiler makes log(a) constexpr
shall we try to fix all or them or wait that some go above the threshold of our performance-radar?