Closed nathanl closed 5 years ago
:wave:
benchee measures using a monotonic clock, so all time elapsed is considered no matter where it was spent.
The more likely source of this always being the same (imo) is that the signed value is cached somewhere (for a combination of secret, salt and input) and then just returned more or less immediately upon further invocation. As warmup isn't counted there should be no measurable difference.
I can see a couple of ways around this:
Of course all of these rely on my assumption that cache is involved which right now is the only way that I could explain that behaviour but I might be wrong :)
Hi,
the behaviour heavily hints at what I was suspecting.
I mildy modified it (not as many iterations as it takes too long) and added a small custom formatter to look at the raw run times:
inputs: %{
1 => 1,
20_000 => 20_000
},
warmup: 0,
time: 1,
formatters: [
{Benchee.Formatters.Console, extended_statistics: true},
fn suite ->
Enum.each(suite.scenarios, fn scenario ->
IO.puts("#{scenario.name}:")
IO.inspect(scenario.run_time_data.samples)
end)
end
]
)
as you can see, I also took warmup out of the equation (because that one big time where it is first calculated happens during warmup and hence doesn't show up in the results at all).
Here's what I get:
sign_and_verify:
[3464088.0, 55764.0, 30283.0, 29346.0, 26398.0, 27421.0, 26489.0, 26442.0,
25288.0, 27206.0, 25236.0, 25531.0, 2.34e4, 22253.0, 24424.0, 25257.0, 25676.0,
29571.0, 23954.0, 26265.0, 22351.0, 27024.0, 22443.0, 21854.0, 27460.0,
41186.0, 26549.0, 29703.0, 23529.0, 22679.0, 29550.0, 22873.0, 2.23e4, 27988.0,
23040.0, 23002.0, 27774.0, 22435.0, 22208.0, 26973.0, 22513.0, 22569.0,
27488.0, 24514.0, 22431.0, 29543.0, 32822.0, 38169.0, 25355.0, 25009.0, ...]
sign_and_verify:
[40175774.0, 39955.0, 29489.0, 26335.0, 23168.0, 24290.0, 24974.0, 25533.0,
25953.0, 26969.0, 23825.0, 27584.0, 22642.0, 26907.0, 22638.0, 22058.0,
27950.0, 22145.0, 21529.0, 27354.0, 22368.0, 22736.0, 28298.0, 21946.0,
21485.0, 27308.0, 22926.0, 22013.0, 28218.0, 22931.0, 23343.0, 27576.0,
22926.0, 24748.0, 39374.0, 23984.0, 28529.0, 45464.0, 29909.0, 25449.0,
26165.0, 22488.0, 22248.0, 23947.0, 24769.0, 25590.0, 27690.0, 22967.0,
27042.0, 23329.0, ...]
you can see that after a first big initial time the times normalize. Caching at this point also makes sense so I'd say that's what it is and recommend one of the approaches suggested above :)
Cheers, Tobi
Interesting, and thanks for looking into this. I agree that caching looks like a likely explanation based on your results. However, I've updated the test to use a random UUID as the value to sign, and I still see the pattern you show of long initial execution times tailing off to a shorter time.
Updated code:
defmodule Mix.Tasks.BenchMarkToken do
use Mix.Task
alias MyAppWeb.Endpoint
@shortdoc "Compares token runs"
def run(_) do
Mix.Task.run("app.start")
Benchee.run(
%{
"sign_and_verify" => fn key_iterations ->
uuid = Ecto.UUID.generate()
signed_token =
Phoenix.Token.sign(
Endpoint,
"some salt",
uuid,
key_iterations: key_iterations
)
{:ok, ^uuid} =
Phoenix.Token.verify(
Endpoint,
"some salt",
signed_token,
key_iterations: key_iterations,
max_age: :infinity
)
end
},
inputs: %{
1 => 1,
200_000 => 200_000
},
warmup: 0,
time: 1,
formatters: [
{Benchee.Formatters.Console, extended_statistics: true},
fn suite ->
Enum.each(suite.scenarios, fn scenario ->
IO.puts("#{scenario.name} (#{scenario.input}):")
IO.inspect(scenario.run_time_data.samples)
end)
end
]
)
end
end
produces
sign_and_verify (1):
[7768990.0, 126990.0, 83990.0, 146990.0, 80990.0, 93990.0, 75990.0, 72990.0,
74990.0, 77990.0, 67990.0, 64990.0, 68990.0, 66990.0, 64990.0, 65990.0,
74990.0, 65990.0, 64990.0, 72990.0, 67990.0, 61990.0, 65990.0, 66990.0,
69990.0, 67990.0, 64990.0, 67990.0, 65990.0, 73990.0, 74990.0, 73990.0,
68990.0, 66990.0, 65990.0, 65990.0, 66990.0, 83990.0, 69990.0, 66990.0,
64990.0, 69990.0, 63990.0, 62990.0, 62990.0, 64990.0, 61990.0, 61990.0,
74990.0, 57990.0, ...]
sign_and_verify (200000):
[260805990.0, 91990.0, 78990.0, 60990.0, 59990.0, 59990.0, 56990.0, 60990.0,
55990.0, 58990.0, 53990.0, 53990.0, 53990.0, 55990.0, 52990.0, 52990.0,
55990.0, 53990.0, 51990.0, 53990.0, 53990.0, 53990.0, 54990.0, 49990.0,
53990.0, 53990.0, 58990.0, 60990.0, 56990.0, 62990.0, 49990.0, 48990.0,
47990.0, 47990.0, 53990.0, 48990.0, 47990.0, 46990.0, 50990.0, 45990.0,
44990.0, 45990.0, 47990.0, 44990.0, 64990.0, 57990.0, 55990.0, 55990.0,
56990.0, 54990.0, ...]
It still could be something specific to this function, though. Will tinker further when I have time.
Hey,
I looked at it a bit and both signing and verifying call out to KeyGenerator which takes the number of iterations, the algorithm, the salt and the secret - notably not the data to be signed - to generate a key to do the signing. call is here
That explains the behaviour. First call the key is generated which is slow, subsequent invocations are equally fast as intended.
As the salt needs to vary, you should vary the salt in your benchmarks not the user data. :)
Hope that helps, if anything comes up feel free to reopen.
I ran a benchmark on
Phoenix.Token
to get info about how different:key_iterations
values would impact performance.To my surprise, benchee reports no difference in how long it takes sign + verify with 1 key iteration vs 20,000,000. However, there clearly is a difference, because the wall clock run time is very different depending on which value I use.
Here's the mix task where I run the comparison:
If I comment out one or the other of the
:inputs
, I get large variations in measurement from the mix task itself.For
1
:For
20_000_000
One of my colleagues guessed that this may be explained as follows: