bencheeorg / benchee

Easy and extensible benchmarking in Elixir providing you with lots of statistics!
MIT License

High memory consumption in some cases #389

Open lud-wj opened 1 year ago

lud-wj commented 1 year ago

Hello,

When running the following code, I see that input "single" with case "head_match" will consume all my RAM and all my swap, and will get OOM killed.

The same case with input "large" is fine, as are all other cases with any input.

It seems that the problem always arises after the last input/scenario, no matter which one I comment out.

inputs = %{
  "large" => Map.new(1..33, fn n -> {:"key_#{n}", []} end),
  "single" => %{key_1: []}
}

Benchee.run(
  %{
    "body_match" => fn map ->
      %{key_1: value} = map
      value
    end,
    "case" => fn map ->
      case map do
        %{key_1: value} -> value
      end
    end,
    "dot" => fn map ->
      map.key_1
    end,
    "head_match" => fn
      %{key_1: value} ->
        value
    end
  },
  inputs: inputs,
  pre_check: true,
  time: 5
)

I have no idea why.

Maybe a bug?

Edit: My suspicion is that it is such a stupid and useless benchmark that the functions just run too many times and generate too many measurements. In that case, just close the issue <3

PragTob commented 10 months ago

:wave:

Thanks for reporting an issue! :green_heart:

Sorry for the long silence!

Also, yes, that's likely - they run so fast that so many measurements are generated in memory that it breaks, depending on how much RAM you have.

That doesn't mean we can't do better (this has happened before) - we could offer something like max_measurements to avoid and short-circuit these kinds of problems :thinking:
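To illustrate the idea, here is a minimal sketch of what such a cap could look like: a measurement loop that stops early once either the time budget or a maximum sample count is reached. `max_measurements` is a hypothetical option here, and the module and function names are illustrative, not Benchee's actual implementation.

```elixir
defmodule CappedRunner do
  # Hypothetical sketch: repeatedly measure `fun` until either the time
  # budget (in nanoseconds) is exhausted or `max_measurements` samples
  # have been collected, whichever comes first.
  def run(fun, time_budget_ns, max_measurements) do
    start = System.monotonic_time(:nanosecond)
    collect(fun, start, time_budget_ns, max_measurements, [])
  end

  # Sample cap reached: stop even if time budget remains.
  defp collect(_fun, _start, _budget, 0, acc), do: Enum.reverse(acc)

  defp collect(fun, start, budget, remaining, acc) do
    before = System.monotonic_time(:nanosecond)
    fun.()
    duration = System.monotonic_time(:nanosecond) - before
    acc = [duration | acc]

    if System.monotonic_time(:nanosecond) - start >= budget do
      Enum.reverse(acc)
    else
      collect(fun, start, budget, remaining - 1, acc)
    end
  end
end

# For a fast function, the sample count stays bounded regardless of time:
samples = CappedRunner.run(fn -> :ok end, 1_000_000_000, 10_000)
length(samples) <= 10_000
```

The key point is that memory usage becomes proportional to `max_measurements` instead of to how fast the benchmarked function happens to be.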

I'll run it later just to make sure. OK, I ran it now. From the looks of it, yeah, it's definitely just too many samples:

The remedy right now is to just reduce the time:

inputs = %{
  "large" => Map.new(1..33, fn n -> {:"key_#{n}", []} end),
  "single" => %{key_1: []}
}

Benchee.run(
  %{
    "body_match" => fn map ->
      %{key_1: value} = map
      value
    end,
    "case" => fn map ->
      case map do
        %{key_1: value} -> value
      end
    end,
    "dot" => fn map ->
      map.key_1
    end,
    "head_match" => fn
      %{key_1: value} ->
        value
    end
  },
  inputs: inputs,
  pre_check: true,
  time: 1,
  warmup: 0.1,
  formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
)

Which gets me this output:

```
Operating System: Linux
CPU Information: AMD Ryzen 9 5900X 12-Core Processor
Number of Available Cores: 24
Available memory: 31.25 GB
Elixir 1.15.4
Erlang 26.0.2

Benchmark suite executing with the following configuration:
warmup: 100 ms
time: 1 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: large, single
Estimated total run time: 8.80 s

Benchmarking body_match with input large ...
Benchmarking body_match with input single ...
Benchmarking case with input large ...
Benchmarking case with input single ...
Benchmarking dot with input large ...
Benchmarking dot with input single ...
Benchmarking head_match with input large ...
Benchmarking head_match with input single ...

##### With input large #####
Name                 ips        average  deviation         median         99th %
head_match       31.18 M       32.07 ns    ±37.93%          30 ns          40 ns
case             30.84 M       32.43 ns    ±37.84%          30 ns          40 ns
dot              30.81 M       32.46 ns   ±139.96%          30 ns          50 ns
body_match       30.13 M       33.19 ns    ±38.29%          30 ns          40 ns

Comparison:
head_match       31.18 M
case             30.84 M - 1.01x slower +0.36 ns
dot              30.81 M - 1.01x slower +0.39 ns
body_match       30.13 M - 1.04x slower +1.13 ns

Extended statistics:

Name               minimum        maximum    sample size           mode
head_match           20 ns        4590 ns         2.22 M          30 ns
case                 20 ns        4810 ns         2.22 M          30 ns
dot                  20 ns       64761 ns         2.22 M          30 ns
body_match           20 ns        4780 ns         2.22 M          30 ns

##### With input single #####
Name                 ips        average  deviation         median         99th %
body_match       35.04 M       28.54 ns    ±42.13%          30 ns          40 ns
head_match       34.98 M       28.59 ns    ±49.44%          30 ns          40 ns
case             34.97 M       28.60 ns    ±42.11%          30 ns          40 ns
dot              34.80 M       28.74 ns    ±43.26%          30 ns          40 ns

Comparison:
body_match       35.04 M
head_match       34.98 M - 1.00x slower +0.0430 ns
case             34.97 M - 1.00x slower +0.0530 ns
dot              34.80 M - 1.01x slower +0.197 ns

Extended statistics:

Name               minimum        maximum    sample size           mode
body_match           20 ns        4510 ns         2.22 M          30 ns
head_match           20 ns       10491 ns         2.23 M          30 ns
case                 20 ns        4491 ns         2.22 M          30 ns
dot                  20 ns       10200 ns         2.23 M          30 ns
```

You can see that's 2.22M samples per scenario, and 8 scenarios, oh my. The point where it actually blows up is during the statistics calculation, I think (with the longer times), and part of that is that instead of implementing the super hard way to get some stats, the list is actually sorted :sweat_smile: So we could also try to make that better, although benchmarks say that for normal cases it's fine.
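To make the sorting remark concrete, here is a sketch of why a sort makes order statistics trivial but expensive at this sample count: once the list is sorted, a percentile is just an index lookup, but the sort itself is O(n log n) over millions of measurements per scenario. The module below is illustrative, not Benchee's actual code, and it uses a simple nearest-rank style index rather than interpolation.

```elixir
defmodule SortedStats do
  # Sort once, then read any percentile off the sorted list by index.
  # With 2.22M samples per scenario, the Enum.sort/1 call dominates.
  def percentile(samples, p) when p >= 0 and p <= 100 do
    sorted = Enum.sort(samples)
    index = round(p / 100 * (length(sorted) - 1))
    Enum.at(sorted, index)
  end
end

SortedStats.percentile([40, 10, 30, 20, 50], 50)
# => 30 (the median of the five samples)
```

Avoiding the full sort (e.g. with a selection algorithm) is the "super hard way" alluded to above; for typical sample sizes the sort is fine.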

Again, thanks for the report!

lud-wj commented 10 months ago

Hey, thanks for the confirmation!

max_measurements could be nice, but really my benchmark does not do any actual work, so I'm not sure it would be a really useful feature :)

PragTob commented 10 months ago

I mean, it has happened before, and you ran into an issue because of it - so I think benchee as a tool can do better, and I'll see if I can implement it easily, which should be possible. It might need to start tracking the measurement count, but that shouldn't be too bad.

lud-wj commented 10 months ago

Alright! I don't know the benchee code, but indeed it should be simple enough :)

PragTob commented 10 months ago

Looking at the code, part of the problem is probably also that we calculate all the statistics in parallel for each scenario :joy:

  def statistics(suite) do
    percentiles = suite.configuration.percentiles

    update_in(suite.scenarios, fn scenarios ->
      Parallel.map(scenarios, fn scenario ->
        calculate_scenario_statistics(scenario, percentiles)
      end)
    end)
  end

Which... for a normal number of scenarios is probably fine, but overall it's not the greatest, and it may be good to be able to turn it off optionally :joy:

PragTob commented 10 months ago

Same thing for formatters :sweat_smile:

lud-wj commented 10 months ago

If there is a max_concurrency option or something like that, it could indeed be nice to expose it.
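One way this could work: Elixir's standard library already provides a real `:max_concurrency` option on `Task.async_stream/3`, so the per-scenario statistics step could run through it with a user-supplied cap. The scenario data and averaging function below are illustrative stand-ins for Benchee's actual internals.

```elixir
# Illustrative stand-in for per-scenario statistics work.
calculate = fn samples -> Enum.sum(samples) / length(samples) end

scenarios = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# :max_concurrency caps how many scenarios are processed at once,
# bounding peak memory; results come back in input order.
averages =
  scenarios
  |> Task.async_stream(calculate, max_concurrency: 2, ordered: true)
  |> Enum.map(fn {:ok, avg} -> avg end)

# averages == [2.0, 5.0, 8.0]
```

Setting `max_concurrency: 1` would effectively make the step sequential, which is the "turn it off" option discussed above.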

PragTob commented 9 months ago

I had a minor hope that #408 would maybe put a dent in this one, but it doesn't, which makes sense as the inputs used here are relatively small. That said, I haven't forgotten about this one, but the other ones have been more prevalent :sweat_smile: