cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0
1.08k stars 4.32k forks source link

Shall we fix all those inefficient (improper?) use of "pow"? #44414

Open VinInn opened 7 months ago

VinInn commented 7 months ago

most of our use of pow is inefficient if not improper: https://github.com/search?q=repo%3Acms-sw%2Fcmssw+pow%28+language%3AC%2B%2B&type=code&l=C%2B%2B&p=5 1) pow(x,2) : no the compiler will not substitute with xx: we need to introduce our own inline function "square" 2) pow(x,0.5) pow(x,1/3) result correct (yes even for negative x) still a direct call to sqrt and cbrt may be more efficient 3) pow(2,j), pow(10,j), pow (-1, j): no comment 4) pow(x,j) in polynomial: Horner shall be used 5) pow(a,y) where a is a literal: exp(ylog(a)) should be faster provided that the compiler makes log(a) constexpr

shall we try to fix all or them or wait that some go above the threshold of our performance-radar?

cmsbuild commented 7 months ago

cms-bot internal usage

cmsbuild commented 7 months ago

A new Issue was created by @VinInn.

@smuzaffar, @antoniovilela, @Dr15Jones, @sextonkennedy, @rappoccio, @makortel can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

slava77 commented 7 months ago

pow(x,2) : no the compiler will not substitute with x*x: we need to introduce our own inline function "square"

depends on the optimization flags according to some old posts

dan131riley commented 7 months ago

pow(x,2) : no the compiler will not substitute with x*x: we need to introduce our own inline function "square"

depends on the optimization flags according to some old posts

At the very least we should be encouraging using std::pow(), std::sqrt() etc., which will let the compiler use the appropriate type-matched inlined function. This may invoke a compiler builtin implementation, and should avoid any unnecessary type promotions to double (esp. important consideration for vectorization). I'm not convinced it's worth a campaign w/o evidence that these are performance bottlenecks.

VinInn commented 7 months ago

indeed. but "std::pow(x,2)+std::pow(y,2);" is promoted to double... (as any literal non explicitely float) https://godbolt.org/z/ab1Koqvd8

makortel commented 7 months ago

Interesting that more complex expression lead to such differences (although maybe not entirely surprising either), I'd say that discovery increases the motivation to do something. Some evidence on the cost would still be nice.

dan131riley commented 7 months ago

Hadn't thought of that. Will also note (one of my hobby horses) that "std::pow(x, 1.f/3.f);" is very different from "std::pow(x, 1./3.);". The former jumps straight to powf(), while the latter converts everything to double, calls pow(), and converts the result back to float.

VinInn commented 7 months ago

according to experts: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114363

float z = pow(x,2) is immediately converted to z=x*x because it is guaranteed to produce the same results of invoking the double precision function. all other cases not, so the optimization cannot be applied.

dan131riley commented 7 months ago

Tying back to my previous comment, "std::pow(x,2.f)+std::pow(y,2.f);" avoids the type promotion, and compiles down to two instructions if FMA is available. Fun, wasn't expecting that either.

VinInn commented 7 months ago

@dan131riley yeah, we shall just teach people to use 2.f ...

dan131riley commented 7 months ago

@dan131riley yeah, we shall just teach people to use 2.f ...

Oh, I agree, teaching average physicists these corner cases is a non-starter. I mostly have the MkFit developers on board, but that's an experienced bunch. Just thinking about what we can do...