ocurrent / current-bench

Experimental benchmarking infrastructure using OCurrent pipelines
Apache License 2.0

Less verbose output from successful cb-check #451

Open · 3Rafal opened 1 year ago

3Rafal commented 1 year ago

Resolves #450

3Rafal commented 1 year ago

@ElectreAAS

I agree that the full output is too verbose, but this might not be verbose enough. Most people will see Correctly parsed following benchmarks: unnamed because the top-level "name" field is optional.

Yes, I agree that "unnamed" is pretty bad. Why are the empty benchmark names allowed?

Also, I'm wondering if I understand the cb-check use cases correctly. AFAIK it currently works like this:

  1. You can pass multiple benchmark objects inside one json file (see the example after this list)
  2. They will first be parsed, and then merged
  3. Then you get a list of the JSON objects that error out, and then a list of the ones that were parsed (and merged).
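For example (if I got this right, the names below are just made up), a single file could contain several benchmark objects, one per line:

{ "name": "suite-a", "results": [ { "name": "bench-1", "metrics": { "time": "1.2s" } } ] }
{ "name": "suite-b", "results": [ { "name": "bench-2", "metrics": { "time": "3.4s" } } ] }

and cb-check would then report which objects parsed successfully and which errored out.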

I find the following a bit confusing:

Because of the merging, I think that printing the whole output might actually be better, since the merging behaviour is non-intuitive. Maybe we should add some doc/readme note about this?

art-w commented 1 year ago

Why are the empty benchmark names allowed?

Most projects have only one benchmark battery and so don't specify a toplevel name (I'm fine with "unnamed" personally, or "default")
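So an output with no toplevel name, e.g. something like:

{ "results": [ { "name": "mybench", "metrics": { "foo": "12.34s" } } ] }

is the one that ends up being reported as unnamed.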

Why do we allow passing multiple benchmarks?

If one does specify different names, then it creates multiple tabs in the sidebar on the left of the website (so basically, it allows creating multiple benchmark batteries that don't pollute the main page). I don't think this is used much atm, but you can imagine running multiple programs in make bench and wanting their outcomes to be separate.
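For example (the battery names here are invented), printing two objects with different toplevel names would show up as two separate tabs:

{ "name": "macro", "results": [ { "name": "mybench", "metrics": { "foo": "12.34s" } } ] }
{ "name": "micro", "results": [ { "name": "otherbench", "metrics": { "bar": "56.78s" } } ] }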

What's the point of merging? What are the merging rules?

Yeah sorry, this isn't documented much either... The intent is that a benchmark can produce each metric one by one in a streaming way, as soon as it computes it, rather than accumulating and printing them all at the end:

{ "results": [ { "name": "mybench", "metrics": { "foo": "12.34s" } } ] }
... any other debug log output ...
{ "results": [ { "name": "mybench", "metrics": { "bar": "56.78s" } } ] }

This should give you the same as if you printed a single output:

{ "results": [ { "name": "mybench", "metrics": {  "foo": "12.34s", "bar": "56.78s" } } ] }

If you print the same metric multiple times, then the results get accumulated to compute their min/mean/max (and display the error margin on the graphs). This enables running the same benchmark in a for loop if more precision is needed.
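For example, printing the same metric twice (say, from a for loop):

{ "results": [ { "name": "mybench", "metrics": { "foo": "12.10s" } } ] }
{ "results": [ { "name": "mybench", "metrics": { "foo": "12.58s" } } ] }

gets accumulated into a single foo datapoint with its min/mean/max.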

Finally, by printing metrics asap, the website can display the updated graphs while the benchmark is still running :) (so by running the important metrics first, one does not need to wait for the full benchmark to complete before seeing the impact of the changes)

3Rafal commented 1 year ago

Thanks for the great answer @art-w! Can we add some of this information to the readme? I think it would make the new-user experience much better. :)