voltrondata-labs / arrowbench

R package for benchmarking

Insert benchmark defaults for params in `run_one()` #129

Open alistaire47 opened 1 year ago

alistaire47 commented 1 year ago

Currently `run_one()` does not call `get_default_parameters()`. Benchmark parameter defaults are still used when the benchmark runs, but parameters that are not explicitly specified do not show up in tags as they should.

One way to deal with this would be to call `get_default_parameters()` on the params and ensure the result has exactly one row. (Note that we can't do this right now, because voltrondata-labs/benchmarks sends `cpu_count = NULL` to `run_one()`. `run_one()` currently interprets that as "do nothing about CPU count in the script", whereas `get_default_parameters()` will turn it into `c(1L, parallel::detectCores())`. For Arrow in particular, `arrow:::GetCpuThreadPoolCapacity()` appears to equal `parallel::detectCores()`, but changing the way we set it may set the `cpu_count` tag and break histories. We probably should make the change anyway, but we will then need to go clean up the histories.)
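A minimal sketch of the "exactly one row" check, and of why `cpu_count = NULL` gets in the way. It does not use arrowbench itself: `toy_default_parameters()`, `toy_setup()` and `resolve_one()` are hypothetical stand-ins, and the real `get_default_parameters()` may have a different signature and behavior. The only thing borrowed from the description above is that a `NULL` CPU count expands to `c(1L, parallel::detectCores())`.

```r
# Hypothetical stand-in for arrowbench's get_default_parameters(); the real
# function's signature and behavior may differ.
toy_default_parameters <- function(setup, ..., cpu_count = NULL) {
  # Start from the defaults declared on the setup function, then apply overrides.
  params <- lapply(as.list(formals(setup)), eval)
  params <- modifyList(params, list(...))
  # Mimic the behavior described above: a NULL cpu_count expands to two values.
  if (is.null(cpu_count)) {
    cpu_count <- c(1L, parallel::detectCores())
  }
  params$cpu_count <- cpu_count
  # One row per combination of parameter values.
  expand.grid(params, stringsAsFactors = FALSE)
}

# Hypothetical benchmark setup function; every argument has a default.
toy_setup <- function(dataset = "fanniemae", format = "parquet") NULL

# Resolve params for a single run and insist on exactly one combination.
resolve_one <- function(setup, ...) {
  params <- toy_default_parameters(setup, ...)
  if (nrow(params) != 1L) {
    stop("Expected exactly one parameter combination, got ", nrow(params))
  }
  params
}

# With an explicit cpu_count, the defaults are filled in and the check passes.
resolved <- resolve_one(toy_setup, format = "feather", cpu_count = 2L)

# With cpu_count left as NULL there are two candidate rows, so the check fails.
null_result <- tryCatch(
  resolve_one(toy_setup),
  error = function(e) conditionMessage(e)
)
```

In this sketch `resolved` carries the unspecified `dataset` default alongside the explicit values, which is what the tags are currently missing, while `null_result` holds the error message from the failed one-row check.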

Alternatively, we could validate based on `formals(bm$setup)` and insert defaults, or error if any params are not specified. However, maintaining an alternate version of `get_default_parameters()` would increase complexity and the maintenance burden.
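For comparison, the `formals()`-based approach might look roughly like the sketch below. `insert_defaults_or_error()` and `toy_setup()` are hypothetical and not part of arrowbench, and the sketch assumes that defaults are simple expressions that can be evaluated in the setup function's environment. It also shows the drawback noted above: the helper reimplements default handling that already lives elsewhere.

```r
# Hypothetical helper, not part of arrowbench: fill in unspecified params from
# the defaults declared on a setup function, erroring when that is impossible.
insert_defaults_or_error <- function(setup, params = list()) {
  defaults <- as.list(formals(setup))
  defaults[["..."]] <- NULL

  unknown <- setdiff(names(params), names(defaults))
  if (length(unknown) > 0L) {
    stop("Unknown parameter(s): ", paste(unknown, collapse = ", "))
  }

  # An argument without a default is stored in formals() as the empty symbol.
  required <- vapply(
    defaults,
    function(d) identical(d, quote(expr = )),
    logical(1)
  )
  missing_required <- setdiff(names(defaults)[required], names(params))
  if (length(missing_required) > 0L) {
    stop("No value or default for: ", paste(missing_required, collapse = ", "))
  }

  # Insert the declared default for every param the caller left out.
  for (name in setdiff(names(defaults)[!required], names(params))) {
    params[name] <- list(eval(defaults[[name]], environment(setup)))
  }

  # Return params in the order the setup function declares them.
  params[names(defaults)]
}

# Hypothetical setup function: `dataset` is required, the others have defaults.
toy_setup <- function(dataset, format = "parquet", compression = "snappy") NULL

filled <- insert_defaults_or_error(toy_setup, list(dataset = "taxi"))

# Omitting the required `dataset` param is an error rather than a silent gap.
missing_result <- tryCatch(
  insert_defaults_or_error(toy_setup, list()),
  error = function(e) conditionMessage(e)
)
```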