Closed: harendra-kumar closed this issue 3 years ago.
It is not clear to me that `s` is inherently better than `ms` or `μs` or `ns` or `ps`. Dumping picoseconds means that we do not have to deal with floating-point numbers (and their parsing). `tasty-bench` both prints and reads CSV, so it would be inconvenient to have more than one line per benchmark. I can think about exposing more internals, so that an external client could measure a benchmark with a given number of iterations directly, without a CSV file as a medium.
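A quick illustration of the parsing point (not from the thread itself): integer picosecond counts round-trip through text exactly, whereas a double has only a 53-bit mantissa, so sufficiently long measurements lose picosecond resolution once stored as fractional seconds.

```python
# Integer picoseconds round-trip through CSV text exactly:
ps = 123456789012345  # ~123.5 seconds, expressed in picoseconds
assert int(str(ps)) == ps

# A double cannot represent every integer above 2**53, so a long
# measurement silently loses its last picosecond when converted:
big_ps = 2**53 + 1
assert float(big_ps) != big_ps  # rounds to 2**53

# Round-tripping through fractional seconds loses that precision too:
secs = big_ps / 10**12  # ~2.5 hours, as a float in seconds
assert round(secs * 10**12) != big_ps
```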
`secs` is as arbitrary as any other unit. When using `ps` we are assuming we would never have to represent time at a finer granularity. If we use seconds, we deal with floating-point numbers but do not assume a precision. Also, `secs` is the unit used in `gauge` and `criterion`, so we would not have to change anything in our analysis tools if it were `secs`.

`gauge` provides a `--csvraw` option which dumps per-sample data, vs the `--csv` option which combines iterations into one single data point. Would something like that be a possible way to deal with this? I am not stuck because of this, so this is just a suggestion and not a pain point.

I don't want to look stubborn, but switching from picoseconds to seconds is a breaking change. How important is it for you? I kinda feel that compatibility with the CSV format of `criterion` or `gauge` is a lost cause anyway.
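The `--csvraw`-vs-`--csv` distinction can be sketched in a few lines: collapsing per-sample rows into one (mean, stdev) point per benchmark. The row shape `(name, seconds)` here is illustrative, not `gauge`'s actual schema.

```python
import statistics

def aggregate(raw_rows):
    """Collapse per-sample measurements (as a --csvraw-style dump would
    contain) into one (mean, stdev) point per benchmark, as --csv does.
    The (name, seconds) row shape is hypothetical, for illustration."""
    by_name = {}
    for name, secs in raw_rows:
        by_name.setdefault(name, []).append(secs)
    return {name: (statistics.mean(xs), statistics.pstdev(xs))
            for name, xs in by_name.items()}

rows = [("foo", 0.010), ("foo", 0.012), ("foo", 0.011),
        ("bar", 0.200), ("bar", 0.202)]
print(aggregate(rows))
```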
Generating two incompatible CSV reports is confusing: nothing prevents a user from generating a `--baseline` file with a hypothetical `--csvraw` instead of `--csv`.
My resistance is partly caused by the current architecture, which makes dumping raw samples difficult. But I also think CSV is a poor format for interprocess communication and a likely source of future compatibility issues. I'd prefer to expose more internals, so that clients would be able to roll their own statistical analysis, communicating in-process.
> How important is it for you?
Not critical, but nice to have for compatibility. Until `tasty-bench` is tested and becomes reliable for our existing benchmarking infrastructure, we would like to keep `gauge` too as a backup option. For that we will have to use separate handling in the analysis/reporting for the two tools, or maybe preprocess the CSV file generated by `tasty-bench` to convert the column to seconds.
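The preprocessing mentioned above could be a small script along these lines. This is a sketch: it assumes a tasty-bench-style header that marks picosecond columns with `(ps)` (e.g. `Mean (ps)`), with the benchmark name in the first column.

```python
import csv
import io

def ps_to_secs(csv_text):
    """Convert every '(ps)' column of a tasty-bench-style CSV to seconds.

    Assumed (not guaranteed) input shape: first column is the benchmark
    name, picosecond columns are labelled with '(ps)' in the header.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    ps_cols = [i for i, h in enumerate(header) if "(ps)" in h]
    out_header = [h.replace("(ps)", "(s)") for h in header]
    out_rows = [
        [f"{int(cell) / 1e12:.12f}" if i in ps_cols else cell
         for i, cell in enumerate(row)]
        for row in body
    ]
    buf = io.StringIO()
    w = csv.writer(buf)
    w.writerow(out_header)
    w.writerows(out_rows)
    return buf.getvalue()

sample = "Name,Mean (ps),2*Stdev (ps)\nfoo,2000000000,50000000\n"
print(ps_to_secs(sample))
```

Running the converted file through the existing `gauge`/`criterion` analysis tooling would then only require matching the column names, not the units.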
Regarding (4): in general, a persistent file is a good interface for such purposes, CSV or otherwise. We may not want to perform the statistical analysis in real time; instead we may want to process the raw data offline at any time to produce different reports or presentations of the data. For that, some persistent format to store the data is required.
The easiest possible way would be to store all the data points in the CSV. The internal `tasty-bench` analysis can just use the last two points for each benchmark to calculate its results. Anyway, as I said earlier, this is not critical for me as of now, but something to consider/discuss.
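Selecting the last points per benchmark from such a file is cheap. A sketch, assuming a hypothetical raw CSV of the shape `Name,iterations,time(ps)` with each benchmark's rows in measurement order:

```python
import csv
import io
from collections import OrderedDict

def last_points(raw_csv, n=2):
    """Keep only the last n rows per benchmark from a hypothetical raw
    CSV of the shape 'Name,iterations,time(ps)'. Rows for a benchmark
    are assumed to appear in measurement order."""
    rows = list(csv.reader(io.StringIO(raw_csv)))
    header, body = rows[0], rows[1:]
    by_name = OrderedDict()
    for row in body:
        by_name.setdefault(row[0], []).append(row)
    return [header] + [r for rs in by_name.values() for r in rs[-n:]]

raw = ("Name,iterations,time(ps)\n"
      "foo,1,1100\nfoo,2,2050\nfoo,4,4010\n"
      "bar,1,500\nbar,2,990\n")
for row in last_points(raw):
    print(",".join(row))
```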
I took a closer look at the CSV reports of `criterion` and `gauge`. It appears that the headers in `--csv` and `--csvraw` modes are incompatible. For example, the former (both in `criterion` and in `gauge`) names a column `Mean`, while the latter uses `time` or `cpuTime`. If there were an appetite for a change, I'd rather conform to the more prevalent format; I guess `--csvraw` is rarely used without preprocessing.

I've added incantations to fake both formats (headers + measurements in seconds) at the bottom of the https://github.com/Bodigrim/tasty-bench#comparison-against-baseline section.
Once you have a `Benchmarkable` object, you can explode it and write your own driver to run the required metrics with the required number of iterations and report them in the required format. If your goal is just to collect raw samples, you cannot really benefit much from `defaultMain` and the `tasty` framework.

Another option is to generalize `csvReporter`, so that clients could specify the desired format.

Upd.: But this is problematic, because we would not be able to parse the data back to perform a comparison against baseline.
To sum up, matching `--csvraw` columns is a low priority, because it is rarely consumed by humans without preprocessing. Matching `--csv` is more sensible, but because of a different statistical model we can only fake it, so there is not a huge win in synchronizing column names. It could create a dangerous illusion that CSV reports from different frameworks are comparable. The situation with `streamly` is very much unique, I believe; I have not seen any other Haskell project with such an extensive benchmark suite and harness.
Wanted to discuss a few minor details about the CSV file format.

- `seconds` instead of `ps`?
- `gcBytesCopied` would be more informative than `Copied`.
- An `iterations` column reporting the number of iterations, with the rest of the columns reporting raw data corresponding to that many iterations as usual. This will allow other tools to do any statistical analysis over the whole series of measurements.
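To illustrate what such a series buys an external tool (a sketch; the `(iterations, total_time)` shape is hypothetical, not tasty-bench's actual format): instead of using only the last measurements, one can estimate per-iteration time from the whole series, e.g. with a least-squares fit through the origin.

```python
def per_iteration_estimate(samples):
    """Least-squares slope through the origin for (iterations, total_time)
    pairs: estimates time per iteration from a whole raw series.
    The input shape is hypothetical, for illustration only."""
    num = sum(n * t for n, t in samples)
    den = sum(n * n for n, _ in samples)
    return num / den

# Total time (in ps) grows roughly linearly, ~1000 ps per iteration:
samples = [(1, 1020), (2, 1990), (4, 4030), (8, 7980)]
print(round(per_iteration_estimate(samples)))
```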