It is really interesting to see Numba come out faster than Julia. However, while reproducing the results, I found that some benchmarks in this repo do not actually support the conclusions drawn from them.

TL;DR: after fixing the benchmark, Julia is >10x faster than Numba for the `evaluate_functions` case.

Consider `evaluate_functions`: the Numba version uses a parallel for loop. It appears to produce faster programs than the other languages, but the comparison is misleading. The program using `prange` gives different results from the one using `range`, and only the latter matches the behaviour of the corresponding Julia program. (The snippets below assume `import numpy as np` and `from numba import njit, prange`.)
```python
In [17]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:     Evaluate the trigonometric functions for n values evenly
    ...:     spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in range(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [18]: evaluate_functions(10)
Out[18]:
array([1.46030424, 1.13579218, 0.81128013, 0.48676808, 0.16225603,
       0.16225603, 0.48676808, 0.81128013, 1.13579218, 1.46030424])
```
```python
In [19]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:     Evaluate the trigonometric functions for n values evenly
    ...:     spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in prange(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [20]: evaluate_functions(10)
Out[20]:
array([-1500.        , -1166.66666667,  -833.33333333,  -500.        ,
        -166.66666667,   166.66666667,   500.        ,   833.33333333,
        1166.66666667,  1500.        ])
```
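Note that the `prange` version returns the input `linspace` unchanged. This is consistent with the loop-carried dependency being silently broken: `prange` requires iterations to be independent, but here each iteration reads the `vector1` written by the previous one. A plain-NumPy sketch of the sequential semantics (no Numba required; the function name is mine) reproduces the `range` numbers:

```python
import numpy as np

def evaluate_functions_sequential(n):
    # Same loop body as the benchmark, executed strictly in order:
    # iteration i reads the vector1 produced by iteration i - 1.
    vector1 = np.linspace(-1500.0, 1500.0, n)
    for _ in range(10000):
        vector2 = np.sin(vector1)
        vector1 = np.arcsin(vector2)
        vector2 = np.cos(vector1)
        vector1 = np.arccos(vector2)
        vector2 = np.tan(vector1)
        vector1 = np.arctan(vector2)
    return vector1

print(evaluate_functions_sequential(10))
```

This prints the same symmetric positive values as `Out[18]` above, which is what the Julia program computes as well.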
The Julia behaviour:
```julia
function evaluatefunctions(N)
    # x = linspace(-1500.0, 1500.0, N)  # pre-1.0 syntax
    x = collect(range(-1500.0, stop=1500.0, length=N))
    M = 10000
    for i in 1:M
        y = sin.(x)
        x = asin.(y)
        y = cos.(x)
        x = acos.(y)
        y = tan.(x)
        x = atan.(y)
    end
    return x
end
```
```julia
julia> evaluatefunctions(N)
10-element Vector{Float64}:
 1.4603042376686257
 1.135792184853451
 0.8112801320381631
 0.48676807922287524
 0.16225602640761583
 0.16225602640761583
 0.48676807922287524
 0.8112801320381631
 1.135792184853451
 1.4603042376686257
```
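As an aside, the 10000 iterations are mostly no-ops: `arcsin(sin(x))` folds any x into [-π/2, π/2], `arccos(cos(y))` maps that to |y|, and `arctan(tan(z))` is the identity on (-π/2, π/2), so every element reaches a fixed point after the first pass, up to round-off. A quick check (pure NumPy, condensed loop body, names are mine):

```python
import numpy as np

def fold(v):
    # One pass of the benchmark's loop body, condensed.
    v = np.arcsin(np.sin(v))   # folds into [-pi/2, pi/2]
    v = np.arccos(np.cos(v))   # y -> |y|, so now in [0, pi/2]
    v = np.arctan(np.tan(v))   # identity on (-pi/2, pi/2)
    return v

x = np.linspace(-1500.0, 1500.0, 10)
one_pass = fold(x)

many_passes = x
for _ in range(10000):
    many_passes = fold(many_passes)

# one_pass and many_passes agree to floating-point precision.
```

So the benchmark effectively times 10000 redundant passes of ufunc work, which is still a valid target as long as every language does the same redundant work sequentially.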
Fixing the incorrect use of `prange` gives a fair comparison:

```python
In [16]: %timeit evaluate_functions(10)
55.4 ms ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
```julia
julia> N = 10
10

julia> @btime evaluatefunctions(N)
  4.224 ms (60001 allocations: 8.24 MiB)
```