
Revisit erroring behaviour of special functions #36786

MasonProtter opened this issue 4 years ago (status: Open)

MasonProtter commented 4 years ago

Obviously, this can't be changed until 2.0, but I think we should revisit the design of special functions such as sqrt, sin, etc. to make them pure functions which cannot error. The fact that sqrt(-1) or sin(Inf) produce errors makes many important optimizations impossible, or harder for the compiler to perform. We can solve these things with special cases in the compiler, but that'll quickly turn into optimization whack-a-mole.

The IEEE floating point specification (and computer hardware) already has a great solution for this: NaN. One can argue that producing NaN for these functions is the correct thing to do according to the IEEE spec.
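
Concretely, the current behavior versus what a NaN-returning alternative would look like (nan_sqrt is only an illustrative name, not a proposed API):

sqrt(-1.0)       # ERROR: DomainError with -1.0
sin(Inf)         # ERROR: DomainError with Inf

# a sketch of the IEEE-style alternative: return NaN instead of throwing
nan_sqrt(x::Float64) = x < 0 ? NaN : sqrt(x)
nan_sqrt(-1.0)   # NaN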

The main problem that I foresee with returning NaN rather than an error is that NaNs are notoriously difficult to track down, but I feel that with the new compiler pass infrastructure (or even the current Cassette.jl / IRTools.jl), better tooling for detecting NaN production would address this complaint and allow us to make more principled, general optimizations for these functions.
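
As a rough illustration of what such tooling could do (a hypothetical sketch only; real tooling would hook the compiler or use Cassette.jl / IRTools.jl rather than wrapping calls by hand, and nanwatch is not an existing API):

function nanwatch(f, args...)
    y = f(args...)
    # warn whenever a call returns NaN
    y isa AbstractFloat && isnan(y) && @warn "NaN produced" f args
    return y
end

nanwatch(x -> (x / 0) * 0, 1.0)   # warns: Inf * 0 produces NaN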

musm commented 4 years ago

Just use https://github.com/mlubin/NaNMath.jl ?

musm commented 4 years ago

Related / Dupe of https://github.com/JuliaLang/julia/issues/5234

MasonProtter commented 4 years ago

I've been told to use NaNMath several times, but first of all, NaNMath.jl does not solve the optimization problem here: it uses ccalls that the compiler can't prove are pure, so it has the same issue (in addition to being slower than Base):

julia> using StaticArrays, NaNMath, BenchmarkTools

julia> let v = @SVector(rand(20)), x = 5.0
           sinx = NaNMath.sin(x)
           @btime $v .+ NaNMath.sin($x)
           @btime $v .+ $sinx
           @btime $v .+ sin($x)
       end;
  13.175 ns (0 allocations: 0 bytes)
  0.019 ns (0 allocations: 0 bytes)
  10.801 ns (0 allocations: 0 bytes)

If the compiler knew that Base.sin or NaNMath.sin were pure functions, then all three of those benchmarks would be the same.

Second of all, I think if Base is going to provide these functions, it's worth having Base do them right rather than rely on external packages to patch performance problems.

musm commented 4 years ago

Base.sin is about 10-15% faster than NaNMath.sin because Base.sin is implemented entirely in Julia, based on the openlibm code, while NaNMath.sin calls the openlibm library directly, and the Julia implementation runs faster. This entirely explains the performance difference between NaNMath.sin and Base.sin.
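
For reference, "calling the openlibm library directly" means an opaque foreign call of roughly this shape (a paraphrase, not NaNMath's exact source, and it assumes the bundled OpenLibm can be loaded as "libopenlibm"):

# an opaque ccall the Julia compiler cannot look into or prove pure
libm_sin(x::Float64) = ccall((:sin, "libopenlibm"), Float64, (Float64,), x)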

Second, if you take the Base.sin implementation, remove the domain errors, and use NaN purely to signal the error behavior, there is zero difference in performance.
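
In other words, something like the following (a semantically equivalent sketch with an illustrative name; the gist linked below edits the Base implementation itself rather than wrapping it):

# return NaN instead of throwing for the non-finite inputs Base.sin rejects
nan_sin(x::Float64) = isinf(x) ? NaN : sin(x)

nan_sin(Inf)   # NaN instead of a DomainError
nan_sin(0.5)   # same result as sin(0.5) on the finite path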

> If the compiler knew that Base.sin or NaNMath.sin were pure functions, then all three of those benchmarks would be the same.

So, we have just explained why NaNMath is slightly slower.

Let's examine:

           @btime $v .+ $sinx # ex1
           @btime $v .+ sin($x) # ex2

You are comparing adding a precomputed scalar to a vector (ex1) with computing sin and then adding it to the vector (ex2). Just look at the timings: ex1 reports 0.02 ns, and there is no way to compute sin in 0.02 ns — ex1 simply does not include the sin call.

Please look at: https://gist.github.com/musm/1080639da2c5575cfe0fca39d77f70b6

So I see no performance issues

MasonProtter commented 4 years ago

@musm the entire point I'm highlighting here is that making these functions pure would allow optimizations like hoisting them out of loops. I know sin can't be computed that fast; I want the compiler to be able to use constant propagation and compile-time evaluation on these functions.
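
For example, if purity were known, a call on a constant argument could be evaluated at compile time (g and h are made-up examples):

g() = sin(0.5)
# if sin were known to be pure, this could compile to `return 0.479425538604203`
# with no runtime call, just as `h() = 2 + 3` folds to `return 5`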

vchuravy commented 4 years ago

The most important optimization we are currently missing is LICM (loop-invariant code motion). I have been thinking about how we might get there, and IIUC it doesn't require a change in error behavior.

To explain why, let's look at a function we currently fail to optimize.

function f(x, A)
  for i in 1:length(A)
      A[i] = A[i] + sin(x)
  end
end

This turns into:

@label Entry
    N = length(A)

@label EntryCheck
    cond = N > 0
    !cond && @goto Exit

@label LoopHeader
    i = 0

@label Loop
    i += 1
    A[i] = A[i] + sin(x)
    cond = i < N
    cond && @goto Loop

@label Exit
    return

This is valid Julia code, but I find it useful to have names for the regions of code. We want to move sin(x) out of Loop, and currently LLVM can't do this for us because it can't prove that the call to sin is side-effect-free. The kinds of side effects to worry about are:

  1. Throws an error
  2. Accesses global memory (its behavior depends on the outside world, or it changes the outside world)
  3. Writes to memory passed as an argument
  4. Reading from memory passed as an argument

Functions we can't hoist fall into buckets 2 or 3. Functions in bucket 4 we can hoist if the memory they read from is loop-invariant, and functions in bucket 1 can generally be hoisted, but they need to be hoisted to the right location.
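
Hypothetical examples of each bucket (purely illustrative):

b1(x) = sqrt(x)                  # 1. may throw (DomainError for x < 0)
const counter = Ref(0)
b2() = (counter[] += 1)          # 2. reads and writes global state
b3!(A, v) = (A[1] = v; nothing)  # 3. writes to memory passed as an argument
b4(A) = A[1]                     # 4. reads from memory passed as an argument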

Ideally we want to hoist into the BB called Entry, but for that to be valid we must prove that the function would be executed at least once. If we hoisted sin there, and x = Inf while length(A) == 0, we would cause a spurious exception and change the program's semantics. But fear not: we can still hoist the call into the BB labeled LoopHeader, where we are guaranteed that the code only runs when the loop executes at least once.
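
Written out by hand, hoisting into LoopHeader amounts to the following (a sketch of the transformed code, not actual compiler output):

function f_hoisted(x, A)
    N = length(A)
    N > 0 || return        # EntryCheck: with an empty A, sin(x) is never evaluated
    s = sin(x)             # LoopHeader: computed once, only if the loop will run
    for i in 1:N
        A[i] = A[i] + s
    end
end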

I might have missed some finer details, but generally speaking, throwing as a side effect is fine.