Closed PragTob closed 2 years ago
Confirmed it's a bug on erlang at least with the reported resolution.
Alrighty. After all of this time I'm here, I'm actually looking at it. Also, sorry for the time but well.. life and stuff. Fuck 2020.
Anyhow.
Investigating this is worse than I anticipated. I initially managed to reproduce but now seemingly can't... I only switched a couple of things around (different formatter) but now... it seems to be working fine? Not sure if it's the mood of the os or whatever.
I changed the bottom of the original script like this:
bench_fun = fn -> View.render("iodata-long", %{character: character}) end
Benchee.run(
%{
"iodata" => bench_fun,
},
formatters: [
{Benchee.Formatters.Console, extended_statistics: true}
]
)
alias Benchee.Benchmark.Collect.NativeTime
fun = fn _i -> {time, _} = NativeTime.collect(bench_fun); time end
1..1000
|> Enum.map(fun)
|> Enum.take(-50)
|> IO.inspect()
1..1000
|> Enum.map(fun)
|> Enum.take(-50)
|> IO.inspect()
and the output I get is quite fine and looks okay:
Operating System: macOS
CPU Information: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
Number of Available Cores: 12
Available memory: 32 GB
Elixir 1.10.3
Erlang 22.3
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 7 s
Benchmarking iodata...
Name ips average deviation median 99th %
iodata 1.34 M 746.69 ns ±3536.53% 494 ns 1973 ns
Extended statistics:
Name minimum maximum sample size mode
iodata 419 ns 13294633 ns 1.13 M 467 ns
[1969, 1998, 1995, 2024, 3585, 1982, 2034, 2081, 1968, 3950, 1996, 2100, 3756,
2047, 1990, 1990, 2105, 2088, 1961, 2021, 3814, 2015, 2052, 1976, 1962, 2006,
1980, 3812, 1985, 1975, 2024, 1961, 1989, 1963, 1958, 3849, 1977, 1993, 2019,
1973, 1975, 2034, 1956, 3849, 1975, 2003, 1999, 1988, 1986, 1999]
[3568, 1981, 2003, 1982, 1992, 1976, 2041, 2013, 3829, 3557, 2303, 1973, 1996,
1981, 2004, 1997, 3775, 2017, 1997, 1988, 2001, 1979, 2045, 4152, 1990, 1989,
2007, 1980, 1965, 1982, 2020, 4032, 1983, 1957, 1982, 1965, 1996, 1973, 2010,
3780, 1986, 1991, 1993, 1980, 1984, 1982, 1997, 3800, 1966, 1978]
Albeit earlier I got this:
Not sure what gives.
which now... makes me think that maybe after all it is the function call overhead measurements that are messing with things.
Hello, I'm just running into this issue today on a 2018 15" Macbook Pro.
I am running Benchee 1.0.1
and
Elixir 1.11.4
Erlang 23.3.1
I am getting graphs like the ones posted here, with many 0us or 1000ns results.
It looks like the Erlang issue is resolved in my version of Erlang.
Monotonic time results are consistent for me across 5 iex sessions in a row, it looks like I have a lower resolution than the OP:
iex(8)> :erlang.system_info(:os_monotonic_time_source)
[
function: :clock_gettime,
clock_id: :CLOCK_MONOTONIC,
resolution: 1000000,
extended: :no,
parallel: :yes,
time: 494623835664000
]
iex(9)> :erlang.monotonic_time()
-576460731659128000
iex(10)> :erlang.monotonic_time()
-576460731045460000
iex(11)> :erlang.monotonic_time()
-576460730387654000
Here are my results for strings_bench using your edits to the script:
Please let me know if I can assist with this further, I'm glad to help out.
This was FINALLY solved in #339
Finally got around to verifying this once I got a hand on a Mac, I'll add more details from my own computer later but the gist is this:
Problem
In whatever circumstance that I don't understand
:erlang.monotonic_time()
only answers in microsecond precision while acting like nanosecond. The impact is that we get measurements that aren't precise enough and as a result are reporting weird measurements/medians of 0ns or what not.Here you can see the
monotonic_time()
behaviour changing, between answering in thousands (microseconds) and precise (nanoseconds)and yes, the native time unit claims to be
:nanosecond
(no change when converting native to nanosecond):as a result our measurements can look like this and produce very odd results:
Background
See this twitter thread, basically I noticed odd looking benchmarking results in this blog post (a median of 0ns). The benchmark can be found here
Further Debugging
After looking into the Erlang Time Correction docs I hoped that the os_monotonic information might be different but apart from the bad time values it seems to be the same between Mac and my linux machine:
I also investigated time_warp_mode but results on the mac didn't change.
Opened an issue at Erlang/OTP: https://bugs.erlang.org/browse/ERL-1067
Possible Solutions
Workarounds possible now
Environment
Values obtained from a MBP from ~2014/2015 running 10.14.2 Mojave on an i5 processor
Is this a benchee bug?
Yes and No. Yes the results reported in the end aren't accurate, however I'd say that this time reporting with another precision than reported is a bug in the BEAM or maybe MacOS or I dunno what.
We still need to fix it/make it better as lots of people out there are doing benchmarks on MacOS.
Can I help?
If you have a mac you can run the benchmark and post your results: https://github.com/oestrich/strings_bench
If you could run an
iex
shell and run:erlang.monotonic_time()
a couple of times, kill the session and run it again seeing if the precision changes as seen above, that'd be nice. I'd like to see if this happens all the time/for everything or just specific configurations.