10.0^68 gives a different result on nightly vs 1.11 after llvm-muladd pass removal

KristofferC commented 1 month ago

Master:

julia> 10.0^68
1.0000000000000001e68

1.11:

julia> 10.0^68
1.0e68

https://github.com/JuliaLang/julia/pull/55802 changed stuff related to muladd which pow uses at

https://github.com/JuliaLang/julia/blob/2a06376c18afd7ec875335070743dcebcd85dee7/base/math.jl#L1206

Putting on milestone so it isn't forgotten, not to say it is a regression (I don't know).

Seelengrab commented 1 month ago

Arguably, the new result is more correct, no? It's exact now, and used to have an error (within one ulp, probably?).

giordano commented 1 month ago

% julia +1.8 -E '10.0 ^ 68'
1.0000000000000001e68
% julia +1.9 -E '10.0 ^ 68'
1.0000000000000001e68
% julia +1.10 -E '10.0 ^ 68'
1.0e68
% julia +1.11 -E '10.0 ^ 68' 
1.0e68
% julia +nightly -E '10.0 ^ 68'
1.0000000000000001e68

Note that #49405 first appeared in v1.10, and #55802 restored the muladd behaviour outside of @simd on master.

KristofferC commented 1 month ago

It's exact now, and used to have an error (within one ulp, probably?).

Maybe I am dumb but isn't it the opposite?

KristofferC commented 1 month ago

Note that https://github.com/JuliaLang/julia/pull/49405 first appeared in v1.10, and https://github.com/JuliaLang/julia/pull/55802 restored the muladd behaviour outside of @simd on master.

I still don't want to backport a behavior change like that (PkgEval caught some new errors for example).

Seelengrab commented 1 month ago

Maybe I am dumb but isn't it the opposite?

You're right, I mixed up the versions 🤦

giordano commented 1 month ago

https://github.com/JuliaLang/julia/pull/55802 changed stuff related to muladd which pow uses at

https://github.com/JuliaLang/julia/blob/2a06376c18afd7ec875335070743dcebcd85dee7/base/math.jl#L1206

I think you're pointing to the wrong pow_body function, should be pow_body(x::Float64, n::Integer), not pow_body(x::Float64, n::Float64), but there's still a muladd call there: https://github.com/JuliaLang/julia/blob/2a06376c18afd7ec875335070743dcebcd85dee7/base/math.jl#L1275

The difference in code_llvm(Base.Math.pow_body, (Float64, Int)) is this in v1.11

L124:                                             ; preds = %L119, %L28.preheader
  %value_phi6.lcssa = phi double [ %value_phi6.ph, %L28.preheader ], [ %25, %L119 ]
  %value_phi7.lcssa = phi double [ 0.000000e+00, %L28.preheader ], [ %value_phi22, %L119 ]
  %value_phi8.lcssa = phi double [ 1.000000e+00, %L28.preheader ], [ %value_phi23, %L119 ]
  %value_phi9.lcssa = phi double [ %value_phi9.ph, %L28.preheader ], [ %22, %L119 ]
;  @ math.jl:1224 within `pow_body`
; ┌ @ float.jl:493 within `*`
   %28 = fmul contract double %value_phi7.lcssa, %value_phi9.lcssa
; └
; ┌ @ float.jl:496 within `muladd`
   %29 = fmul contract double %value_phi6.lcssa, %value_phi8.lcssa
   %30 = fadd contract double %29, %28
; └
;  @ math.jl:1225 within `pow_body`
; ┌ @ float.jl:705 within `isfinite`
; │┌ @ float.jl:492 within `-`
    %31 = fsub double %value_phi9.lcssa, %value_phi9.lcssa
    %32 = fsub double %30, %30
; └└
; ┌ @ bool.jl:38 within `&`
   %33 = fcmp uno double %31, %32
; └
; ┌ @ float.jl:496 within `muladd`
   %34 = fmul contract double %value_phi8.lcssa, %value_phi9.lcssa
; └
; ┌ @ essentials.jl:796 within `ifelse`
   %35 = select i1 %33, double -0.000000e+00, double %30
   %36 = fadd contract double %34, %35
   br label %common.ret
; └

vs this in nightly:

L124:                                             ; preds = %L119, %L28.preheader
  %value_phi2.lcssa = phi double [ %value_phi2.ph, %L28.preheader ], [ %25, %L119 ]
  %value_phi3.lcssa = phi double [ 0.000000e+00, %L28.preheader ], [ %value_phi10, %L119 ]
  %value_phi4.lcssa = phi double [ 1.000000e+00, %L28.preheader ], [ %value_phi11, %L119 ]
  %value_phi5.lcssa = phi double [ %value_phi5.ph, %L28.preheader ], [ %22, %L119 ]
;  @ math.jl:1275 within `pow_body`
; ┌ @ float.jl:493 within `*`
   %28 = fmul double %value_phi3.lcssa, %value_phi5.lcssa
; └
; ┌ @ float.jl:496 within `muladd`
   %29 = fmul contract double %value_phi2.lcssa, %value_phi4.lcssa
   %30 = fadd contract double %29, %28
; └
;  @ math.jl:1276 within `pow_body`
; ┌ @ float.jl:705 within `isfinite`
; │┌ @ float.jl:492 within `-`
    %31 = fsub double %value_phi5.lcssa, %value_phi5.lcssa
    %32 = fsub double %30, %30
; └└
; ┌ @ bool.jl:38 within `&`
   %33 = fcmp uno double %31, %32
; └
; ┌ @ float.jl:496 within `muladd`
   %34 = fmul double %value_phi4.lcssa, %value_phi5.lcssa
; └
; ┌ @ essentials.jl:790 within `ifelse`
   %35 = select i1 %33, double -0.000000e+00, double %30
   %36 = fadd contract double %34, %35
   br label %common.ret

Note the fmul which now isn't annotated as contract anymore, which was precisely the point of #55802 to fix #55785.

KristofferC commented 1 month ago

Note the fmul which now isn't annotated as contract anymore, which was precisely the point of https://github.com/JuliaLang/julia/pull/55802 to fix https://github.com/JuliaLang/julia/issues/55785.

You might be misunderstanding the point of this issue. I am not saying there is anything wrong with #55802. I am saying that the pow implementation might be wrong in how it relied on this bug since it now gives a "wrong" answer.

oscardssmith commented 1 month ago

How does the performance compare between before and after? I am slightly worried that I might have been accidentally relying on LLVM re-associating all my math.

giordano commented 1 month ago

You might be misunderstanding the point of this issue

What I tried to show above is that the current result has been the same since v1.8, except for the 1.10-1.11 window. Since that method has last been touched in the v1.9 time frame I don't see how that relied on a future bug.

KristofferC commented 1 month ago

It could have been wrong back then as well? All I know is a good result turned worse which seems like a regression no matter what values it used to give on old julia versions. The history doesn't really matter.

mbauman commented 1 month ago

Sure, but just because Julia grants (or granted) the ability to reassociate doesn't mean that the compiler does (or did). And it doesn't mean the implementation isn't (or wasn't) buggy.

On ARM64 (M1 Pro), I only see the +1 ULP error on v1.8; all others give 1.0e68.

These two lines are probably not meant to reassociate in the ways that muladd permits:

https://github.com/JuliaLang/julia/blob/2a06376c18afd7ec875335070743dcebcd85dee7/base/math.jl#L1275-L1276

oscardssmith commented 1 month ago

in general, we don't promise <1 ULP accuracy for pow. it's nice if it works out well for integers, but that isn't a requirement

oscardssmith commented 1 month ago

those were vaguely meant to reassociate there isn't a strong a-priori region to believe one is better than another

PallHaraldsson commented 1 month ago

in general, we don't promise <1 ULP accuracy for pow. it's nice if it works out well for integers, but that isn't a requirement

Shouldn't we, or could for integers and other accurate numbers? I mean we do for BigInt, just thinking, wouldn't exact also be possible for floats until the numbers are too big, and then less than <1 ULP error?

giordano commented 1 month ago

Shouldn't we, or could for integers and other accurate numbers?

Are you really sure the number 1e68 is accurate, for some definition of accuracy?

julia> big(1e68)
9.9999999999999995280522225138166806691251291352861698530421623488512e+67

PallHaraldsson commented 1 month ago

Neither is correct (since impossible), I meant should Julia strive for until it's not possible, i.e. for integers higher than 2^53 right?

[Yes, you're right, it's more accurate. EDITED out my incorrect logic here.]

oscardssmith commented 1 month ago

1e68 is slightly more accurate (otherwise that would be a bug in ryu). It's 0.4 vs 0.6 ULPs, but it's well within the accuracy range of any reasonable pow implementation.

julia> big(10)^68-nextfloat(10.0^68)
-7.253143638152923512615837440964652195551821015547904e+51

julia> big(10)^68-1e68
4.719477774861833193308748708647138301469578376511488e+51

JuliaLang / julia

10.0^68 gives a different result on nightly vs 1.11 after llvm-muladd pass removal #56312