voltrondata-labs / arrowbench

R package for benchmarking

Make result JSON self-contained #98

Closed alistaire47 closed 2 years ago

alistaire47 commented 2 years ago

Closes #97. This adds metadata to the JSON generated by {arrowbench} so that it contains everything required to pass to conbench.record() (as {ursacomputing/benchmarks} does now). With that, a directory of results can be handled without maintaining metadata elsewhere, which is a prerequisite for switching {ursacomputing/benchmarks} from run_one() to run_benchmark().

Notes:

What the JSON now looks like (lightly abridged):

{
  "name": "array_altrep_materialization",
  "tags": {
    "exclude_nulls": false,
    "altrep": false,
    "subset_indices": [
      [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ],
    "dataset": "type_integers",
    "language": "R"
  },
  "info": {
    "arrow_version": "9.0.0-SNAPSHOT",
    "arrow_compiler_id": "AppleClang",
    "arrow_compiler_version": "13.1.6.13160021",
    "benchmark_language_version": "R version 4.1.3 (2022-03-10)",
    "arrow_version_r": "8.0.0.9000"
  },
  "context": {
    "arrow_compiler_flags": " -Qunused-arguments -fcolor-diagnostics -O3 -DNDEBUG",
    "benchmark_language": "R"
  },
  "github": {
    "repository": "https://github.com/apache/arrow",
    "commit": "d9d78946607f36e25e9d812a5cc956bd00ab2bc9"
  },
  "options": {
    "iterations": 1,
    "drop_caches": false,
    "cpu_count": null
  },
  "result": [
    {
      "process": 0.00262000000000001,
      "real": 0.0025826720520854,
      "start_mem_bytes": 178601984,
      "end_mem_bytes": 192430080,
      "max_mem_bytes": 215236608,
      "gc_level0": 0,
      "gc_level1": 0,
      "gc_level2": 0
    }
  ],
  "params": {
    "source": "type_integers",
    "exclude_nulls": false,
    "altrep": false,
    "subset_indices": [
      [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    ],
    "cpu_count": 10,
    "lib_path": "latest",
    "packages": [
      {
        "package": "arrow",
        "version": "8.0.0.9000",
        "date": "2022-05-12",
        "source": "local"
      },
      {
        "package": "arrowbench",
        "version": "0.1.0",
        "date": "2022-05-24",
        "source": "local"
      },
      ... < this gets very long >
    ]
  },
  "output": "### RESULTS HAVE BEEN PARSED ###",
  "rscript": [
    "",
    "library(arrowbench)",
    ...
    "cat(\"\n##### RESULTS END\n\")"
  ]
}
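Since all the metadata now travels with the result itself, a result file can be loaded and inspected in isolation. A minimal sketch, assuming the {jsonlite} package; the inline JSON here is a hypothetical abridged result standing in for a file written by run_benchmark():

```r
library(jsonlite)

# Hypothetical abridged result; in practice this would come from a file,
# e.g. fromJSON("result.json", simplifyVector = FALSE).
json <- '{
  "name": "array_altrep_materialization",
  "tags": {"dataset": "type_integers", "language": "R"},
  "github": {"commit": "d9d78946607f36e25e9d812a5cc956bd00ab2bc9"}
}'

result <- fromJSON(json, simplifyVector = FALSE)

result$name           # benchmark name
result$tags$dataset   # case identifier
result$github$commit  # source revision
```

With simplifyVector = FALSE the JSON round-trips as nested lists, so the structure mirrors the document above exactly.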
alistaire47 commented 2 years ago

Maybe we should separate the case params from the global params (cpu_count, lib_path, and packages). It doesn't have to happen now, and I haven't yet seen what the downstream effects would be, but I feel like sooner or later conflating args and state is going to bite us, and now we have lots of metadata describing state anyway.
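A sketch of what that separation could look like; the function name and the set of global param names are hypothetical, not actual {arrowbench} API:

```r
# Hypothetical: names treated as global run state rather than case arguments
global_param_names <- c("cpu_count", "lib_path", "packages")

split_params <- function(params) {
  is_global <- names(params) %in% global_param_names
  list(
    case_params   = params[!is_global],  # args that define the benchmark case
    global_params = params[is_global]    # state shared across the whole run
  )
}

params <- list(
  source = "type_integers",
  exclude_nulls = FALSE,
  cpu_count = 10,
  lib_path = "latest"
)
split_params(params)
```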

alistaire47 commented 2 years ago

The only substantive thing I would consider adding is: how hard would it be to make a reader function in {arrowbench} that constructs a results object from one of these JSONs (without being run through run_benchmark())? That would let us experiment with this a bit more easily in {arrowbench} (and be the start of the class for results objects we've talked about).

For a reasonably comprehensive implementation (I'm thinking R6, maybe light validation, adjusting the as.data.frame methods, serialization to/from JSON, and adjusting all the code to use it everywhere) nontrivial, but doable. Maybe 2-3 days? A little more if something's more complicated than I realize.
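A sketch of what such a reader and results class could look like, assuming the {R6} and {jsonlite} packages; the class name, field set, and validation are illustrative only, not the eventual {arrowbench} design:

```r
library(R6)
library(jsonlite)

# Hypothetical results class; name and fields are illustrative.
BenchmarkResult <- R6Class("BenchmarkResult",
  public = list(
    json = NULL,
    initialize = function(json) {
      # Light validation: require the top-level fields conbench needs
      required <- c("name", "tags", "info", "context", "github")
      missing <- setdiff(required, names(json))
      if (length(missing)) {
        stop("Result JSON missing fields: ", paste(missing, collapse = ", "))
      }
      self$json <- json
    }
  )
)

# Hypothetical reader: construct a results object straight from a JSON file,
# without going through run_benchmark()
read_result <- function(path) {
  BenchmarkResult$new(fromJSON(path, simplifyVector = FALSE))
}
```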

jonkeane commented 2 years ago

> For a reasonably comprehensive implementation (I'm thinking R6, maybe light validation, adjusting the as.data.frame methods, serialization to/from JSON, and adjusting all the code to use it everywhere) nontrivial, but doable. Maybe 2-3 days? A little more if something's more complicated than I realize.

Cool, let's do that as a follow-on, though relatively soon (excepting time off + the weekend etc., of course!)