File output for benchmark results.

OliverKillane commented 8 months ago

Goals

[x] Generate a representation of the tree of benchmarks that could be converted to different file types. Done with StatTree that can be passed to file outputs after all benches are complete
[x] Ensure the tree structure generated always matches the tree structure printed by the TreePainter Wrapped the tree painter with the StatsCollector, calls to make parents in StatsCollector propagate to both TreePainter and the tree construction
[x] Make minimal code change to TreePainter, Stats representations to simplify review
[x] Generate json (nested, and flat) output as mvp, writable to a provided file path, leaving space to implement other outputs (markdown, CSV, html).
[x] Concatenate to the specified file (using https://jsonlines.org/) to allow multi-benchmark runs.
[x] Add description of append to help menu, and including description in both json outputs.

Closes #10

Usage: 

Arguments:
  [FILTER]...  Only run benchmarks whose names match this pattern

Options:
      --test                           Run benchmarks once to ensure they run successfully
      --list                           Lists benchmarks

    ...

      --file <PATH>                    Set the file path to output the results to [env: DIVAN_WRITE_FILE=]
      --format <FORMAT>                Set output file type [env: DIVAN_FILE_FORMAT=] [possible values: json, jsonflat]
  -h, --help                           Print help

Similar to PR #29 However:

Addressing @polarathene 's comment - not tied to json, easy to implement other formats.
Includes alloc_tallies information

Potential Improvements

CSV output, HTML output with graphs? or markdown table output to support (a prerequisite for #14 )
Current code for Stats is messy / uses a custom enum_map for counters and alloc_tallies, can be simplified (decided not to this PR to reduce change size (I'm lazy) )
Refactor the TreePainter code given we can now merge together some of the methods for starting and finishing leaves, and optionally track depth on the stack, rather than passing around a single depth counter. (once again, minimal change via laziness)

Don't think the above are enough to avoid PR approval, however:

Changes

Even when not selecting to output a file, a tree of stats is built (performance regression)
A --format and --file options provided to specify file. File path must be specified (rather than placing in OUT_DIR as with #29

OliverKillane commented 8 months ago

@nvzqz I have some free time this weekend to make any changes to this PR you'd like.

nopeslide commented 7 months ago

Instead of overwriting the specified file on each run, how about treating it as an append-only log? in case of json as jsonlines (json representation stripped of whitespaces, occupying a single line)

This would solve your filename problem by introducing a simple single-file format for collecting multiple benchmark results. Further enabling down the road:

trivial merging multiple partial benchmarks (sth criterion can not)
Paired benchmark results
performance history

cameronelliott commented 7 months ago

@nopeslide said:

Instead of overwriting the specified file on each run, how about treating it as an append-only log? in case of json as jsonlines (json representation stripped of whitespaces, occupying a single line)

This would solve your filename problem by introducing a simple single-file format for collecting multiple benchmark results. Further enabling down the road:

trivial merging multiple partial benchmarks (sth criterion can not)

Paired benchmark results

performance history

I really like this idea, as long as the file is truncated at the beginning of the multi-bench run:

# first run
cargo bench -q -p examples --  --file=result.json
# second run
cargo bench -q -p examples --  --file=result.json
# result.json should only contain the results from the 2nd run,
# as the 2nd run truncated the file prior to running the multi-bench series.

I am not sure if you have the state or necessary data to do the truncation a single time at the start of the run, but, hopefully you do.

I think is a good suggestion also from the point of view of dropping all the data into a single file rather than multiple file names based upon the test-name etc. It would make finding/processing/making-sure-you-got-all-results easier to make deterministic and robust. I feel having to scan the file system for multiple output .json files would be just a little bit brittle, and maybe a little complicated for users trying to read json results.

I think it would be surprising to users if the file did not get truncated at first, and would continue to grow throughout multiple independent command-line invocations. (This would be very unusual behavior for a tool that outputs Json, and might confuse or trip-up users not super familiar with divan)

Thank you for writing the PR @OliverKillane

cameronelliott commented 7 months ago

I cloned the branch by @OliverKillane and made the mods to do the append: let file = OpenOptions::new().append(true).write(true).create(true).open(path)?;

But, it doesn't appear there is a hook or place where information about the first invocation of the series of benches is available. So it might just be the case that the best way forward is append without, sadly truncation on a single multi-bench run.

This would be a little bit unusual in my opinion, but it's not the end of the world. And I think it's a big step up from having multiple output files.

OliverKillane commented 7 months ago

Thanks for the comments @nopeslide and @cameronelliott .

One file is simpler and imo therefore better.

Not keen on having history recorded in a benchmarks output, the goal of the file output should be to be simple and make parsing by whatever scripts the user has on the output end as easy as possible.
When appending to the same json, we need some way to identify benches by name (e.g. json is { "benchname1" : {..ben data..}, "benchname2" : ... }, requires us to either re-parse, or erase the last }, and , "{benchname}" : { write output and then close brackets.

@nvzqz has yet to comment, so I'll hold off having fun with changes to this PR until there is an agreed output/design. In the meantime we can all enjoy out own forks , or someone add yet another file output PR 😂.

cameronelliott commented 7 months ago

A quick note, maybe you mentioned this above, or are already aware of it:

When I do a clean checkout of your fork @OliverKillane and run like so, I get an error about Unrecognized option: 'file', and bench failed.

git clone https://github.com/OliverKillane/divan divan-xxx
cd divan-xxx
git checkout -b fout origin/enh/file-output
cargo bench -q -p examples -- --file=x.x

[output removed]
╰─ now                           │               │               │               │         │
   ├─ instant      13.01 ns      │ 13.58 ns      │ 13.05 ns      │ 13.05 ns      │ 100     │ 12800
   ├─ system_time  13 ns         │ 13.08 ns      │ 13.04 ns      │ 13.04 ns      │ 100     │ 12800
   ╰─ tsc (x86)    6.413 ns      │ 6.476 ns      │ 6.433 ns      │ 6.436 ns      │ 100     │ 25600

Results written to x.x
error: Unrecognized option: 'file'
error: bench failed, to rerun pass `-p examples --bench util`

I haven't looked into it further, so far.

Maybe I am overlooking something obvious?

OliverKillane commented 7 months ago

@cameronelliott I am getting this also. Adding to PR tasks

# produces result, but also 'error: Unrecognized option: 'file'' message
cargo bench -q -p examples -- --file=x.x  
# perfectly fine
DIVAN_WRITE_FILE=x.x cargo bench -q -p examples

OliverKillane commented 7 months ago

@cameronelliott removing from PR description this is a bug for other divan options.

Example:

git switch main
cargo bench -p examples -- --bytes-format=decimal  

...output
error: Unrecognized option: 'bytes-format'
error: bench failed, to rerun pass `-p examples --bench util`

Root Cause:

When passing arguments by -- ..args.. arguments are passed to all benchmark binaries (run benchmarks without -q to see the binaries being run). When passing as environment variable, only the benchmark binaries that actually want this (the divan benchmarks) read it, therefore none error.

After all the divan benchmark binaries are run, one more runs:

Running benches/util.rs (target/release/deps/util-276e87b7c0578c78)`

This should not be run, in fact the comment says so (here).

Because utils has no divan main, it takes no such arguments, so it fails for any extra arguments (including --file we provide).

Solution

See #47

cameronelliott commented 7 months ago

Hey @OliverKillane, you probably saw already, but I submitted a PR to your repo with these features: (simple, but maybe helpful) https://github.com/OliverKillane/divan/pull/1

Append to the output files rather than truncating when opening.
Use the https://jsonlines.org/ format (also called newline-delimited JSON).

OliverKillane commented 7 months ago

I see @cameronelliott & I've approved.

We still might want to add the following:

[ ] truncate file when new group of benchmarks are run (tricky as you mentioned previously)
[ ] include benchmark binary name in file? (Though they already have the benchmark name)
[ ] Add jsonlines output explainer to json help description, or document somewhere for users

I need to sleep now. Thanks for the contrib

cameronelliott commented 7 months ago

When appending to the same json, we need some way to identify benches by name (e.g. json is { "benchname1" : {..ben data..}, "benchname2" : ... }, requires us to either re-parse, or erase the last }, and , "{benchname}" : { write output and then close brackets.

I think the current json format is fine. Here is a screen shot from my test run:

The Json lines seem to match the output folder, except for

util.rs (no benchmarks)
hash.rs, image.rs, scrath.rs (need a feature)

examples/benches:

c@intel12400 ~/r/divan-cam (fout)> ls -1 examples/benches/
atomic.rs
collections.rs
hash.rs
image.rs
math.rs
memcpy.rs
panic.rs
README.md@
scratch.rs
search.rs
sort.rs
string.rs
threads.rs
time.rs
util.rs

cameronelliott commented 7 months ago

I see @cameronelliott & I've approved.

We still might want to add the following:

[ ] truncate file ...

[ ] include benchmark binary name in file? ...

[ ] Add jsonlines output explainer to json help ...

I need to sleep now. Thanks for the contrib

Re: truncation. From my perspective, I say let's skip it. I didn't understand the situation: It's likley going to be very difficult or a hack to do this. I just don't think we have the information available on when to do a truncation, and we don't get called once for startup we just get called for each individual bench.

Re: include benchmark binary name in file. I'd also vote to skip this too. I am not sure of the value, and this could introduce gotchas. End-users should be able to correlate running a bench with the json output, they control the output file names.

Re: Add jsonlines output explainer to json help Yeah, this is important.

@OliverKillane Nice work! This PR is just what I was looking for! @nopeslide Thanks for the suggestions, and pointing out the spec for Jsonlines. Nice.

We got some Json! Onward and upward! 🚀

cameronelliott commented 7 months ago

Hey @nopeslide @OliverKillane, I tried running the Json output through a Json to Rust type creator. It works, but the types are based upon the names of the benchmarks. This means for every different benchmark you run, if you use a Json -> Rust types translator, you will get a different set of Rust types. The same holds for Go, Java, Js, etc etc.

This basically means you cannot use strongly typed structs for manipulating the data on the reader/ingress side. This makes working with the Json in memory difficult and unwieldy.

This means you can't use schema converters as is typically done in Rust/Js/Go/etc/etc

Tomorrow I should wrap a PR proposal that output using a single type, so schema converters can work. This also makes CSV output it trivial. (I will include this)

If we go with the current tree output:

Json to schema converters don't work over multiple benchmarks rows.
End-users cannot compile in strong types for the benchmark 'rows' into Rust, Go, etc.
CSV output is going to be a whole different beast (CSV can't do trees)
Consumers must use Json node trees, and not strong types from schema to type converters.

My proposal will allow:

Allow schema converters to produce a single canonical type from sample Json
Allow Rust, Java, Js, etc, etc consumers to compile in a single strong type for benchmark rows.
Allow me to output CSV in a matter of minutes, without a seperate project to flatten a tree to rows.
Having a vector of benchmark results not a Json tree structure
Not having to navigate a tree for consumers.

Hopefully you are both okay to review this and give me feedback when I get it ready (~24 hours)

OliverKillane commented 7 months ago

@cameronelliott Keeping representing as a tree allows us to keep nesting information easily available and is main difference with this PR from #29

Divan benchmarks can be nested at different depths, and by parameters, when converting to a flat approach (e.g. vector of rows as you've suggested) the benchmark key becomes the issue:

Representing by string - extracting parameters from the benchmark name is awkward
Representing key as a vector of enums of parameters in order- "key" : [{"name" : "bench1", "threads" : 3, "other_param": 4}], "result" {...} requires re-building the tree on the side of say, a python script that wants to graph threads vs iterations.

If we want easy parsing to rust input as appears to be your rust test case, then I suggest the following:

Include a feature "rust_trees"
When cfg for feature, crate exports the StatTree
Implement deserialize for StatTree
Now library users can produce files, and just use divan::StatTree to reparse benchmarks into rust.
For CSV output, tree flattening has to occur, and information is inevitable made more difficult (column of stringified names & parameters)

For other outputs, e.g. html view, nesting is also important.

cameronelliott commented 7 months ago

I agree the straight json trees can have a lot of value.

But! I also think flattened json can have a lot of value too, many developers use json to types converters. json_typegen json-to-go

Would you accept a PR or potentially accept a PR with an additional format 'json_flattened' ?

This way developers/consumers would have their choice about the easiest way for them to consume the json:

as a tree which needs to be navigated using a json tree navigation library
or as an vec/array of rows of benchmark data using concrete types for divan.

OliverKillane commented 7 months ago

It really depends how @nvzqz wants this done.

I think keeping the internal representation as a tree, and just flattening for a flat output (as will be needed for csv in future) is best. I see no issue in flat and nested json output, tool needs to do what is needed -> I need nested, and you need flat

Flattening the tree is super easy, so I've pushed this:

Names are concatenated with dots See

cargo bench -q -p examples -- --file=x.x --format=jsonflat

Can see the keys, here from python


# in interpreter
import json
lines = []
with open('examples/x.x', 'r') as f:
for l in f: # each line is json
        lines.append(json.loads(line))

lines[0]['benchmarks'].keys()

To get:

dict_keys(['atomic.basic.load.t=1', 'atomic.basic.load.t=16', 'atomic.basic.load.t=20', 'atomic.basic.load.t=4', 'atomic.basic.store.t=1', 'atomic.basic.store.t=16', 'atomic.basic.store.t=20', 'atomic.basic.store.t=4', 'atomic.compare_exchange.fetch_div.t=1', 'atomic.compare_exchange.fetch_div.t=16', 'atomic.compare_exchange.fetch_div.t=20', 'atomic.compare_exchange.fetch_div.t=4', 'atomic.compare_exchange.fetch_mod.t=1', 'atomic.compare_exchange.fetch_mod.t=16', 'atomic.compare_exchange.fetch_mod.t=20', 'atomic.compare_exchange.fetch_mod.t=4', 'atomic.compare_exchange.fetch_mul.t=1', 'atomic.compare_exchange.fetch_mul.t=16', 'atomic.compare_exchange.fetch_mul.t=20', 'atomic.compare_exchange.fetch_mul.t=4', 'atomic.update.fetch_add.t=1', 'atomic.update.fetch_add.t=16', 'atomic.update.fetch_add.t=20', 'atomic.update.fetch_add.t=4', 'atomic.update.fetch_and.t=1', 'atomic.update.fetch_and.t=16', 'atomic.update.fetch_and.t=20', 'atomic.update.fetch_and.t=4', 'atomic.update.fetch_nand.t=1', 'atomic.update.fetch_nand.t=16', 'atomic.update.fetch_nand.t=20', 'atomic.update.fetch_nand.t=4', 'atomic.update.fetch_or.t=1', 'atomic.update.fetch_or.t=16', 'atomic.update.fetch_or.t=20', 'atomic.update.fetch_or.t=4', 'atomic.update.fetch_sub.t=1', 'atomic.update.fetch_sub.t=16', 'atomic.update.fetch_sub.t=20', 'atomic.update.fetch_sub.t=4', 'atomic.update.fetch_xor.t=1', 'atomic.update.fetch_xor.t=16', 'atomic.update.fetch_xor.t=20', 'atomic.update.fetch_xor.t=4'])

cameronelliott commented 7 months ago

@OliverKillane thanks for doing that.

I've submited a PR to you to do the following:

make changes to enable the dozens or hundreds of json to type converter tools, so users of Python, R, Julia, Matlab, can easily ingest divian benchmarks using concrete types for Divan results. See reasoning below.
keep the test name path as a json array. ["atomic","basic","load","t=1"] rather than a string "atomic.basic.load.t=1". Seems better for consumers to me.

what needs to happen to support the dozens or hundreds of tools for converting json to concrete types for languages like Python, Go, Rust, Js, Ts, etc, etc? : Json uses name/value pairs. If the test name path gets stored in the 'name' part of json. End-users can not practically use tools to convert json to their language types.

cameronelliott commented 7 months ago

To others reading along, I think the json flat output could use more review and possibly more tweaks. There is the option of star or snowflake like schemas, or just flattening down the json a little more.

But what is in this PR, if @OliverKillane accepts my PR to his fork, is now usable by those who are using json-to-types conversion tools for rapid development, ie, using Go, Python, Rust or a dozen other languages.

It would be good to hear from the maintainers on vision and goals before investing more time into this. (And maybe merging this PR first)

Big thanks to @OliverKillane for pushing this forward, and writing code to do flatjson also.

cameronelliott commented 7 months ago

For those that are curious, attached is an example of the jsonflat format, based upon a pending PR to @OliverKillane.

The attached file has jsonlines format, and so there is 10 lines for the default 10 enabled benchmarks in the example benches:

flat.json

It was generated like so: rm examples/flat.json ; cargo bench -q -p examples -- --file=flat.json --format=jsonflat

Once the json is prettied, it looks like this:

{
    "benchmarks": [
        {
            "path": [
                "atomic",
                "basic",
                "load",
                "t=1"
            ],
            "result": {
                "alloc_tallies": {
                    "alloc": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "dealloc": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "grow": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "shrink": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    }
                },
                "counters": {
                    "bytes": null,
                    "chars": null,
                    "items": null
                },
                "iter_count": "409600",
                "sample_count": "100",
                "time": {
                    "fastest": 330,
                    "mean": 350,
                    "median": 353,
                    "slowest": 608
                }
            }
        },
[chopped]

OliverKillane commented 7 months ago

It would be good to hear from the maintainers on vision and goals before investing more time into this. (And maybe merging this PR first)

Amen, I've sent an email to @nvzqz for some input. Thanks for the PRs to this PR @cameronelliott

cameronelliott commented 6 months ago

Two issues have revealed themselves using this PR by @OliverKillane with my mini-PRs for flat-json. (Also, thanks again to Oliver, for submitting this useful PR)

let me say up-front, it would be good to hear from the maintainers before taking more action on this repo from my perspective *
There is no mention of any of units of measure ie, nanoseconds, microseconds, bytes, ops/second, etc, etc, etc (see example output below)
A lot of metrics which are not useful valid data are output that are just default values. (see example output below)

This PR is still usable if the consumer is assuming the correct units, but that seems like a poor possibly problematic assumption in the long term. (let's make robust Json, yeah?)

Idea: It seems like making the units part of the name/value name, might be sufficient. Instead of "median":353, something like "median_nanoseconds":353 would be a good step in my experience.

The output of defaults values can, again, be handled by consumers, but, again, it seems like we could do better by pruning all those default/zero branches and the null results also.

See this example:

{
    "benchmarks": [
        {
            "path": [
                "atomic",
                "basic",
                "load",
                "t=1"
            ],
            "result": {
                "alloc_tallies": {
                    "alloc": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "dealloc": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "grow": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    },
                    "shrink": {
                        "count": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        },
                        "size": {
                            "fastest": 0,
                            "mean": 0,
                            "median": 0,
                            "slowest": 0
                        }
                    }
                },
                "counters": {
                    "bytes": null,
                    "chars": null,
                    "items": null
                },
                "iter_count": "409600",
                "sample_count": "100",
                "time": {
                    "fastest": 330,
                    "mean": 350,
                    "median": 353,
                    "slowest": 608
                }
            }
        },
[chopped]

cameronelliott commented 6 months ago

One other nit that is worth mentioning, is the names on the internal data structures are not a great fit for some of the output:

            "size": {
              "fastest": 0,
              "mean": 0,
              "median": 0,
              "slowest": 0
            }

The fastest and slowest labels there really should be minimum, maximum

cameronelliott commented 6 months ago

Also, I think it is worthwhile mentioned there is an issue with outputting counters.

Is it possible to get this in the output:

 "counters": {
                    "bytes": null,
                    "chars": null,
                    "items": null
                },

Which is not valid Json.

FilipAndersson245 commented 6 months ago

@nvzqz whats your thought, we would love to use this in nushell.

FilipAndersson245 commented 6 months ago

@cameronelliott I think the sufix should be inside the data somehow?, not to sure how I would fell about it being part of the name.

cameronelliott commented 6 months ago

@cameronelliott I think the sufix should be inside the data somehow?, not to sure how I would fell about it being part of the name.

@FilipAndersson245 I presume you’re talking about the lack of units. Like nanoseconds or MB/s etc.

The problem I see with putting it in the value part of the JSON, is that it would mean turning the JSON numbers into strings. I suppose that’s a valid option. But it means for reading those values becomes extremely more complex on the parsing/reading side. You no longer have ‘20’ but now “20 ns”

FilipAndersson245 commented 6 months ago

@cameronelliott I think the sufix should be inside the data somehow?, not to sure how I would fell about it being part of the name.

@FilipAndersson245 I presume you’re talking about the lack of units. Like nanoseconds or MB/s etc.

The problem I see with putting it in the value part of the JSON, is that it would mean turning the JSON numbers into strings. I suppose that’s a valid option. But it means for reading those values becomes extremely more complex on the parsing/reading side. You no longer have ‘20’ but now “20 ns”

wouldent it be better to have something like

sufix: ns
time: 250

I agree that the time should only be a digit. either that or have an extra column that is always represented in lowest time format, this case ns so you could have time_ns but i would prefer the first alternative.

OliverKillane commented 6 months ago

@FilipAndersson245 @cameronelliott

The information we need for times is:

The unit
The precision

I don't think we need this on a per-result basis

precision is for all benchmarks
simplest: we can just use the same time unit for all

Furthermore I don't think we even need unit selection, just use that divan records in (picoseconds).

Currently this is done in the description and precision fields.

Given the aim of the json output should be for simple, easy to load (i.e. into other scripts) output of benchmarks (not pretty or human readable json). Hence having all times output be picoseconds is the simplest.

FilipAndersson245 commented 6 months ago

I agree that picosecond output would probably be both best and simplest, converting to different units is simple to do on consumer side.

OliverKillane commented 6 months ago

Also, I think it is worthwhile mentioned there is an issue with outputting counters.

Is it possible to get this in the output:
 "counters": {
                    "bytes": null,
                    "chars": null,
                    "items": null
                },
Which is not valid Json.

I think this is actually valid json:

described at json.org
and in the linked ecma 262 standard here
Hence why serde_json supports the JsonValue::Null used to write this output

If this is causing issues with parsers you are using on your end, maybe we can change the format here to appease them?

OliverKillane commented 6 months ago

I agree that picosecond output would probably be both best and simplest, converting to different units is simple to do on consumer side.

This is currently the implementation chosen.

The description is included in all json generated, for example for the json output provides:

{
    "description" : "Generated by Divan: Time is in picoseconds.",
    ...
}

And flatjson provides:

{
    "description" : "Generated by Divan: Time is in picoseconds, only benchmarks with results (not empty or ignored) are present.",
    ...
}

Idea being: the tool's output documents itself, for those to busy to look at the (currently nonexistent) docs Could remove and document elsewhere, no preference from me.

FilipAndersson245 commented 6 months ago

Aha, sorry I missed that :)

RobbieBuxton commented 5 months ago

@nvzqz would love to see this merged!

OliverKillane commented 5 months ago

This PR is dragging on, at 86 days open now. I'm new to open source contribution, so not sure on etiquette/want review but not to be rude about this (@nvzqz might be busy, ill, on holiday?), or if this is not normal?

Any suggestions @cameronelliott @FilipAndersson245 ? Is there a process I've missed?

I'll keep using my fork, but I can't promise to be quick with rebasing against updates in divan.

nvzqz commented 5 months ago

Sorry to everyone for the delay. I've been dealing with other priorities in my life that have made it difficult to spend time on open source.

I have a pending review going that I promise to submit by Sunday. Thanks for the patience. ❤️

OliverKillane commented 5 months ago

@nvzqz no stress, I understand - just good to know its not been lost😂, your review is much appreciated & I'll get to improvements as soon as its submitted!

RobbieBuxton commented 5 months ago

Any chance this still might get looked at? :grinning:

nvzqz commented 3 months ago

@OliverKillane my plan is to push some changes to your branch this week and then do a squash commit with everyone's changes. The route I decided to go with is a bit different from your changes but based on your work. I'm currently leaning towards starting with a single canonical flat JSON format. I'll do an alpha/beta crates.io release to allow folks to play around with the format.

While I recognize the benefit of a hierarchical/recursive/nested structure in dynamic languages, it doesn't seem to work well in most JSON-consuming environments.

nvzqz commented 3 months ago

Divan benchmarks can be nested at different depths, and by parameters, when converting to a flat approach (e.g. vector of rows as you've suggested) the benchmark key becomes the issue:
* Representing by string - extracting parameters from the benchmark name is awkward

* Representing key as a vector of enums of parameters in order- `"key" : [{"name" : "bench1", "threads" : 3, "other_param": 4}], "result" {...}` requires re-building the tree on the side of say, a python script that wants to graph threads vs iterations.

My plan is to have the tree hierarchy end at the function name. So parameters such as thread count (which is dynamic) wouldn't be considered as part of the hierarchy and instead be information in an attribute that could be differentiated across other runs of the same benchmark.

OliverKillane commented 3 months ago

Wonderful! Thanks for working on this @nvzqz

nvzqz commented 3 months ago

@OliverKillane what did you require from the nested hierarchy that caused you to lean towards that direction?

OliverKillane commented 3 months ago

@nvzqz Was just subjective personal preference for making the file output a very simple copy of the terminal output. My very simple use case was divan -> python -> plots

Json Output

Pros

Uncomplicated structure (just grab json, index by each parameter to traverse tree), groups with same parameter are kept in the same node (get all benchmark x, with some constant parameter, for some types is just benches[some_const][type_name][something] for type_name in ...).
Easy to convert to flat format with simple tree traversal, but rebuilding the tree from a flat output is more work

Cons

Most uses, a flat output is easier to manage (just iterate over a list)
Flat output more natural for storing in relational databases (e.g. benchmark name, commit, list of params, columns per counter/timings)

I have no quantitative basis for one design choice over the other, so long as it is simple/easy output, this works.

For the internal `StatTree`

Simple representation as close to bench/terminal output as possible (hence also the invocation of the treepainter - couple invocations to prevent divergence of StatTree and terminal representations) (I was lazy and didn't want to have to write tests to ensure those representations matched up 😂)
Some outputs in future may want to retain the nesting/order (e.g. html output displaying the tree unfold) (depends where you want to take this @nvzqz )
Easy to do tree->flat, flat->tree more work

Conclusion

Its not a lot of code, not precious about this PR, it might be easier to just do a from-scratch with your own ideas. Having some kind of file output is simply a necessity for a lot of potential users.

nvzqz / divan