dotnet / BenchmarkDotNet

Powerful .NET library for benchmarking
https://benchmarkdotnet.org
MIT License
10.61k stars 972 forks source link

IntroArgumentsSource sample does not work for me #2195

Open KenSykes opened 2 years ago

KenSykes commented 2 years ago

Ran into trouble adding ArgumentsSource to my benchmark and found that even the sample code does not generate the expected results (note the "?")

        | Method | time | x | y | Mean | Error | StdDev |
        | --------------- | ----------------- | --- | --- | -------------------:| ----------------:| ----------------:|
        | SingleArgument | 00:00:00.0100000 |  ? |  ? | 15,780,658.958 ns | 53,493.3152 ns | 50,037.6802 ns |
        | SingleArgument | 00:00:00.1000000 |  ? |  ? | 110,181,308.000 ns | 517,614.3512 ns | 484,176.7852 ns |
        | ManyArguments |                ? | 1 | 1 | 3.135 ns | 0.0852 ns | 0.1326 ns |
        | ManyArguments |                ? | 2 | 2 | 13.571 ns | 0.2180 ns | 0.1933 ns |
        | ManyArguments |                ? | 4 | 4 | 13.478 ns | 0.2188 ns | 0.1940 ns |
        | ManyArguments |                ? | 10 | 10 | 13.471 ns | 0.2294 ns | 0.2034 ns |

The following warning is shown:

// * Warnings *
MultimodalDistribution
  IntroArgumentsSource.ManyArguments: Default -> It seems that the distribution is bimodal (mValue = 3.44)

This occurs with .net5 or .net6 console app, built with VS 2022, using BenchmarkDotNet 0.13.2 (currently the latest stable release)

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using CommandLine;
using System;
using System.Collections.Generic;
using System.Threading;
using System.Timers;

namespace BenchmarkBug
{
    public class IntroArgumentsSource
    {
        [Benchmark]
        [ArgumentsSource(nameof(Numbers))]
        public double ManyArguments(double x, double y) => Math.Pow(x, y);

        public IEnumerable<object[]> Numbers() // for multiple arguments it's an IEnumerable of array of objects (object[])
        {
            yield return new object[] { 1.0, 1.0 };
            yield return new object[] { 2.0, 2.0 };
            yield return new object[] { 4.0, 4.0 };
            yield return new object[] { 10.0, 10.0 };
        }

        [Benchmark]
        [ArgumentsSource(nameof(TimeSpans))]
        public void SingleArgument(TimeSpan time) => Thread.Sleep(time);

        public IEnumerable<object> TimeSpans() // for single argument it's an IEnumerable of objects (object)
        {
            yield return TimeSpan.FromMilliseconds(10);
            yield return TimeSpan.FromMilliseconds(100);
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<IntroArgumentsSource>();
        }
    }
}
AndreyAkinshin commented 2 years ago

even the sample code does not generate the expected results (note the "?")

Question marks are displayed for the argument values that are not defined for the given benchmarks. The benchmark ManyArguments has two arguments: x and y; the benchmark SingleArgument has a singled argument time. Meanwhile, we should display the results within the same table, therefore all columns x, y, and time should be presented. However, x and y are not defined for SingleArgument; time is not defined for ManyArguments. In order to present such not defined values, we display ?.

I guess, this notation confuses you. Which cell value would you prefer for such non-defined values to make the summary table more obvious?

The following warning is shown

This is a valid behavior. We collected multiple measurements for ManyArguments. The internal math engine has detected that the underlying distribution of these measurements is bimodal, therefore you can't rely on the Mean value as the primary metric (you can find the full list of measurements in the logs or in the export results). The multimodality detection is currently based on mvalues, and this approach sometimes detects multimodality in some cases that are not practically important. I'm working on the adoption of a new approach that would improve this particular diagnostic.

What confuses you about this message?

KenSykes commented 2 years ago

Thanks for the quick reply. The first thing that is confusing is the sample shows different output in your documentation (no question marks): https://benchmarkdotnet.org/articles/samples/IntroArgumentsSource.html

The warning didn't confuse me, just included that in case it was relevant.

The problem with the ? is it appears to affect Ratio calculations against a baseline which is the use case I need. Adding the baseline parameter here:

        [Benchmark(Baseline = true)]
        [ArgumentsSource(nameof(Numbers))]
        public double ManyArguments(double x, double y) => Math.Pow(x, y);

Produces ? in the Ratio column:

|         Method |             time |  x |  y |               Mean |           Error |          StdDev |             Median | Ratio | RatioSD |
|--------------- |----------------- |--- |--- |-------------------:|----------------:|----------------:|-------------------:|------:|--------:|
| SingleArgument | 00:00:00.0100000 |  ? |  ? |  15,563,991.071 ns |  38,670.3229 ns |  34,280.2340 ns |  15,570,271.875 ns |     ? |       ? |
|                |                  |    |    |                    |                 |                 |
    |       |         |
| SingleArgument | 00:00:00.1000000 |  ? |  ? | 108,953,360.000 ns | 476,906.2206 ns | 446,098.3747 ns | 108,789,120.000 ns |     ? |       ? |
|                |                  |    |    |                    |                 |                 |
    |       |         |
|  ManyArguments |                ? |  1 |  1 |           2.723 ns |       0.0471 ns |       0.0440 ns |           2.731 ns |  1.00 |    0.00 |
|                |                  |    |    |                    |                 |                 |
    |       |         |
|  ManyArguments |                ? |  2 |  2 |          12.909 ns |       0.2732 ns |       0.3648 ns |          12.832 ns |  1.00 |    0.00 |
|                |                  |    |    |                    |                 |                 |
    |       |         |
|  ManyArguments |                ? |  4 |  4 |          13.062 ns |       0.2836 ns |       0.5919 ns |          12.771 ns |  1.00 |    0.00 |
|                |                  |    |    |                    |                 |                 |
    |       |         |
|  ManyArguments |                ? | 10 | 10 |          12.788 ns |       0.2701 ns |       0.2109 ns |          12.727 ns |  1.00 |    0.00 |

But I see why that makes sense in this case now. Let me see if I can fix this in my real use case..

timcassell commented 2 years ago

Looks like the documentation just needs to be updated. Your output looks correct to me.

YegorStepanov commented 2 years ago

I guess, this notation confuses you. Which cell value would you prefer for such non-defined values to make the summary table more obvious?

? asks the user something or looks like an inner engine error, not like a normal value. Maybe it should be a - sign? ? displays for null values also. #2122 fixes it.

How to distinguish "not defined sign"(?/-) from correct string?

//pseudo code
[ArgumentSource("?")] public string A; //for Benchmark1
[ArgumentSource("abc")] public string B; //for Benchmark2

the table will look like

|  A  |  B     |
|  ?  |  ?     |
|  ?  |  abc   |

Produces ? in the Ratio column:

This is independent from arguments and argument's ? marks.

YegorStepanov commented 2 years ago

Produces ? in the Ratio column:

Look at the empty rows with | | | |. They are mean benchmark group. Ratio is calculated for benchmark group. (not perfect as you can see)

You can try to do this:

[BenchmarkCategory("A")]
[Benchmark] public void B1() { }
[BenchmarkCategory("A")]
[Benchmark] public void B2() { }
KenSykes commented 2 years ago

Thanks everyone I think this answered my question well enough. It would be good to update the docs so the output matches.

adamsitnik commented 2 years ago

It would be good to update the docs so the output matches.

@KenSykes Please send a PR with an update, this is the file that needs to be updated.