Closed by hissssst 5 months ago
I have benchmarked the happy path (i.e. calling the function with cached data), and on Erlang 25 w/o JIT there is no difference, while on OTP 26 w/ JIT enabled the try version is almost 2x faster. So unless you have benchmarks that say otherwise, I would keep the current code as is.
And to be exact: the happy path is way more important here. The sad path will ideally happen only once during the application's lifetime, so that is not an issue.
@hauleth FYI: try/catch slows down even the happy path. The difference in performance is about 30%. Here's a benchmark:
defmodule X do
  def trycatch(x) do
    :persistent_term.get(x)
  catch
    :error, :badarg ->
      :persistent_term.put(x, 123)
      123
  end

  def pg(x) do
    with [] <- :persistent_term.get(x, []) do
      :persistent_term.put(x, 123)
      123
    end
  end
end
X.trycatch(:one)
X.pg(:two)

Benchee.run(%{
  "try" => fn -> X.trycatch(:one) end,
  "persistent_term" => fn -> X.pg(:two) end
})
"""
Operating System: Linux
CPU Information: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Number of Available Cores: 8
Available memory: 15.34 GB
Elixir 1.15.7
Erlang 25.3.2.11
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s
Benchmarking persistent_term...
Benchmarking try...
Name                      ips   average   deviation   median   99th %
persistent_term       18.57 M  53.84 ns  ±52072.04%    15 ns   174 ns
try                   14.63 M  68.34 ns  ±38195.07%    14 ns   255 ns

Comparison:
persistent_term       18.57 M
try                   14.63 M - 1.27x slower +14.50 ns
"""
There is a benchmark in the repo that I have used, and on OTP 26 the difference is nonexistent, while try can cause problems:
Operating System: macOS
CPU Information: Apple M3 Pro
Number of Available Cores: 12
Available memory: 36 GB
Elixir 1.16.2
Erlang 26.2.5
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s
Benchmarking default ...
Benchmarking try ...
Calculating statistics...
Formatting results...
Name              ips   average   deviation   median   99th %
default       24.83 M  40.27 ns  ±27339.14%    42 ns    42 ns
try           23.31 M  42.89 ns  ±30340.69%    42 ns    42 ns

Comparison:
default       24.83 M
try           23.31 M - 1.07x slower +2.62 ns
Also, depending on the run, sometimes try was faster and sometimes case was faster. And a fully correct implementation with case would require creating a reference each time we want to check for existence, which would impose an additional performance penalty. I think that in current and future versions of OTP this should not be an important problem wrt. performance.
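For context, a sketch of what such a reference-based check could look like (the module and key names here are just illustrative; a fresh reference can never equal any stored term, so it is a collision-free "missing" marker, at the cost of one make_ref/0 call per lookup):

```elixir
defmodule RefCheck do
  # A fresh reference is guaranteed not to equal any stored term,
  # so it cannot collide with a cached value -- but it must be
  # created on every single lookup.
  def get(key) do
    ref = make_ref()

    case :persistent_term.get(key, ref) do
      ^ref ->
        # Cache miss: compute, store, and return the value.
        :persistent_term.put(key, 123)
        123

      value ->
        # Cache hit.
        value
    end
  end
end
```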
And the OTP 26 benchmark:
Operating System: Linux
CPU Information: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Number of Available Cores: 8
Available memory: 15.34 GB
Elixir 1.16.2
Erlang 26.2.3
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 24 s
Benchmarking persistent_term...
Benchmarking try...
Name                      ips   average   deviation   median   99th %
persistent_term       15.60 M  64.10 ns  ±96380.83%    23 ns   154 ns
try                   12.69 M  78.79 ns  ±94027.45%    35 ns   155 ns

Comparison:
persistent_term       15.60 M
try                   12.69 M - 1.23x slower +14.70 ns
> depending on the run sometimes try was faster, sometimes case was faster
This is common if you benchmark ns-precise calls without pinning the CPU frequency. I do pin the CPU frequency, and benchmark results are consistent for me within a ±5% difference. But even with this difference, try is always slower.
And there is also a non-benchmark-related argument. try/catch is slower because it initiates a routine that instantiates the stacktrace by jumping through all of the labels in the code. The stacktrace is instantiated on every handled error, even when the stacktrace object is not used; this is done because an exception can occur while handling another exception, and the runtime wants to print both stacktraces in that case. But this is true only for rescue and catch clauses.
Here's a benchmark that demonstrates it:
defmodule X do
  for i <- 1..20 do
    name = :"f#{i}"
    next = :"f#{i + 1}"

    def unquote(name)(x) do
      unquote(next)(x) + 1
    end
  end

  def f21(x) do
    trycatch(x)
  end

  for i <- 1..20 do
    name = :"g#{i}"
    next = :"g#{i + 1}"

    def unquote(name)(x) do
      unquote(next)(x) + 1
    end
  end

  def g21(x) do
    pt(x)
  end

  def trycatch(x) do
    :persistent_term.get(x)
  catch
    :error, :badarg ->
      :persistent_term.put(x, 123)
      123
  end

  def pt(x) do
    with [] <- :persistent_term.get(x, []) do
      :persistent_term.put(x, 123)
      123
    end
  end
end
143 = X.f1(:one)
143 = X.g1(:two)

Benchee.run(
  %{
    "try" => fn -> X.f1(:one) end,
    "persistent_term" => fn -> X.g1(:two) end
  },
  time: 10
)
"""
Operating System: Linux
CPU Information: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Number of Available Cores: 8
Available memory: 15.34 GB
Elixir 1.15.7
Erlang 25.3.2.11
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 24 s
Benchmarking persistent_term...
Benchmarking try...
Name                      ips   average   deviation   median   99th %
persistent_term       23.28 M  42.96 ns  ±74583.43%    18 ns   138 ns
try                   16.34 M  61.20 ns  ±88482.67%    21 ns   133 ns

Comparison:
persistent_term       23.28 M
try                   16.34 M - 1.42x slower +18.24 ns
"""
What will happen if we want a correct implementation? We would need to do:
ref = make_ref()

with ^ref <- :persistent_term.get(x, ref) do
  :persistent_term.put(x, 123)
  123
end
That is what we would need to do to be 100% correct, and I believe it may impose a non-negligible performance hit as well.
First of all, this wouldn't be 100% correct. Second, just use some atom like :__defconst_defonce_stub__. I think that handling an extremely rare case (which, let's be honest, will never occur in any real-world application) is less important than a 30% performance difference on constant access.
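To make the suggestion concrete, here is a minimal sketch of the sentinel-atom variant (the module name and the computed value 123 are placeholders matching the benchmarks above; the stub atom is arbitrary):

```elixir
defmodule Cached do
  @stub :__defconst_defonce_stub__

  # Happy path is a plain get/2 with a sentinel default -- no try/catch,
  # so no stacktrace machinery is involved on cache hits.
  def get(key) do
    case :persistent_term.get(key, @stub) do
      @stub ->
        # Cache miss: compute once, store, return.
        value = compute(key)
        :persistent_term.put(key, value)
        value

      value ->
        value
    end
  end

  # Placeholder for the real (expensive) computation.
  defp compute(_key), do: 123
end
```

The trade-off stated in the thread applies: this is only incorrect in the pathological case where the sentinel atom itself is stored as a cached value.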
try/catch is an extremely slow flow. The faster solution is :persistent_term.get(key, default_value), where default_value is some special value like :__defconst_special_atom__. The latter is the fastest, but may fail in some extreme scenarios (e.g. if the sentinel value itself is ever stored).