Closed alistaire47 closed 2 years ago
Maybe we should separate the case params from the global params (`cpu_count`, `lib_path`, and `packages`). Doesn't have to happen now, and I haven't yet seen what the downstream effects would be, but I feel like sooner or later conflating args and state is going to bite us, and now we have lots of metadata describing state anyway.
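For illustration, the kind of split being suggested might look like this (a Python sketch, not the actual {arrowbench} code; the three key names come from the comment above, everything else is hypothetical):

```python
# Hypothetical sketch: separate global state (cpu_count, lib_path, packages)
# from per-case parameters instead of keeping them in one flat argument dict.
GLOBAL_KEYS = {"cpu_count", "lib_path", "packages"}

def split_params(args):
    """Split a flat argument dict into (case_params, global_params)."""
    case_params = {k: v for k, v in args.items() if k not in GLOBAL_KEYS}
    global_params = {k: v for k, v in args.items() if k in GLOBAL_KEYS}
    return case_params, global_params

case, global_ = split_params(
    {"cpu_count": 8, "source": "dataset_a", "compression": "uncompressed"}
)
# case    -> {"source": "dataset_a", "compression": "uncompressed"}
# global_ -> {"cpu_count": 8}
```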
The only substantive thing I would consider adding is: how hard would it be to make a reader function in {arrowbench} that constructs a results object from one of these JSONs (without it being run through `run_benchmark()`)? That would let us experiment with this a bit more easily in {arrowbench} (and be the start of the class for results objects we've talked about).
For a reasonably comprehensive implementation (I'm thinking R6, maybe light validation, adjusting the `as.data.frame` methods, serialization to/from JSON, and adjusting all the code to use it everywhere) it's nontrivial, but doable. Maybe 2-3 days? A little more if something's more complicated than I realize.
Cool, let's do that as a follow-on, though let's do it relatively soon (excepting time off + the weekend etc., of course!)
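For a rough sense of the shape such a reader could take, here's a minimal sketch in Python (stdlib dataclasses standing in for the R6/validation layer discussed above; the field names are illustrative guesses, not the actual {arrowbench} schema):

```python
# Hypothetical sketch: construct a results object straight from saved JSON,
# without re-running the benchmark. Field names are illustrative only.
import json
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    name: str
    params: dict
    tags: dict
    stats: dict

    @classmethod
    def from_json(cls, text):
        """Build a BenchmarkResult from a JSON string written by a prior run."""
        raw = json.loads(text)
        return cls(
            name=raw["name"],
            params=raw.get("params", {}),
            tags=raw.get("tags", {}),
            stats=raw.get("stats", {}),
        )

text = json.dumps({
    "name": "read_csv",
    "params": {"compression": "gzip"},
    "tags": {"compression": "gzip"},  # case params duplicated into tags
    "stats": {"mean": 1.23},
})
result = BenchmarkResult.from_json(text)
# result.name -> "read_csv"
```

A `to_json()` counterpart plus light validation in `from_json()` would round this out into the serialization layer described above.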
Closes #97. An effort to add metadata to the JSON generated by {arrowbench} such that it contains everything required to pass to `conbench.record()` (like happens in {ursacomputing/benchmarks} right now), so that a directory of results can be handled without maintaining metadata elsewhere, a prerequisite for switching from using `run_one()` to `run_benchmark()` in {ursacomputing/benchmarks}.

Notes:
- Added `auto_unbox = TRUE, null = 'null'` to the `toJSON()` options because `{}` is not `NULL` and we don't want lists of length 1 everywhere. This doesn't seem to break anything.
- Kept `drop_caches = FALSE` for now, as that's the default in {benchmarks} and we're not overriding it anywhere AFAIK. We should probably clear them from R and parameterize this, but that felt beyond the scope of this story.
- Dropped `extra_tags` from {benchmarks} because I don't see them used anywhere.
- Added `output` (`stdout` for the call), though I think this gets handled differently depending on where it gets called. We'll likely need to revisit how we want it to work.
- What `benchmarks.Conbench.record()`, `conbench.record()`, and the conbench POST API want are all slightly different. There is also duplication in here; notably, params for the case are in both `params` and `tags`. A pydantic schema we use everywhere (and recreate in R6?) would be nice.

What the JSON now looks like (lightly abridged):