voltrondata-labs / arrowbench

R package for benchmarking

[MINOR]: add get_params_summary #83

Closed boshek closed 2 years ago

boshek commented 2 years ago

While working through #54, I found it useful to have a data.frame of the parameters and whether each run succeeded. I resisted the urge to just make a summary method, as that seemed like overkill, but I am happy to do that as well. It works like this:

library(arrowbench)
nyc <- run_benchmark(write_file, source = "nyctaxi_2010-01")
#> Running 20 benchmarks with 1 iterations:
#>             source  format  compression       input cpu_count
#> 1  nyctaxi_2010-01 parquet uncompressed arrow_table         1
#> 2  nyctaxi_2010-01 feather uncompressed arrow_table         1
#> 3  nyctaxi_2010-01 parquet       snappy arrow_table         1
#> 5  nyctaxi_2010-01 parquet          lz4 arrow_table         1
#> 6  nyctaxi_2010-01 feather          lz4 arrow_table         1
#> 7  nyctaxi_2010-01 parquet uncompressed  data_frame         1
#> 8  nyctaxi_2010-01 feather uncompressed  data_frame         1
#> 9  nyctaxi_2010-01 parquet       snappy  data_frame         1
#> 11 nyctaxi_2010-01 parquet          lz4  data_frame         1
#> 12 nyctaxi_2010-01 feather          lz4  data_frame         1
#> 13 nyctaxi_2010-01 parquet uncompressed arrow_table        10
#> 14 nyctaxi_2010-01 feather uncompressed arrow_table        10
#> 15 nyctaxi_2010-01 parquet       snappy arrow_table        10
#> 17 nyctaxi_2010-01 parquet          lz4 arrow_table        10
#> 18 nyctaxi_2010-01 feather          lz4 arrow_table        10
#> 19 nyctaxi_2010-01 parquet uncompressed  data_frame        10
#> 20 nyctaxi_2010-01 feather uncompressed  data_frame        10
#> 21 nyctaxi_2010-01 parquet       snappy  data_frame        10
#> 23 nyctaxi_2010-01 parquet          lz4  data_frame        10
#> 24 nyctaxi_2010-01 feather          lz4  data_frame        10
#> Running source=nyctaxi_2010-01 format=parquet compression=uncompressed input=arrow_table cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=uncompressed input=arrow_table cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=snappy input=arrow_table cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=lz4 input=arrow_table cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=lz4 input=arrow_table cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=uncompressed input=data_frame cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=uncompressed input=data_frame cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=snappy input=data_frame cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=lz4 input=data_frame cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=lz4 input=data_frame cpu_count=1 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=uncompressed input=arrow_table cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=uncompressed input=arrow_table cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=snappy input=arrow_table cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=lz4 input=arrow_table cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=lz4 input=arrow_table cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=uncompressed input=data_frame cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=uncompressed input=data_frame cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=snappy input=data_frame cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=parquet compression=lz4 input=data_frame cpu_count=10 name=write_file
#> Running source=nyctaxi_2010-01 format=feather compression=lz4 input=data_frame cpu_count=10 name=write_file
#> Total run time: 4.33247 mins
get_params_summary(nyc)
#>             source  format  compression       input cpu_count lib_path
#> 1  nyctaxi_2010-01 parquet uncompressed arrow_table         1   latest
#> 2  nyctaxi_2010-01 feather uncompressed arrow_table         1   latest
#> 3  nyctaxi_2010-01 parquet       snappy arrow_table         1   latest
#> 4  nyctaxi_2010-01 parquet          lz4 arrow_table         1   latest
#> 5  nyctaxi_2010-01 feather          lz4 arrow_table         1   latest
#> 6  nyctaxi_2010-01 parquet uncompressed  data_frame         1   latest
#> 7  nyctaxi_2010-01 feather uncompressed  data_frame         1   latest
#> 8  nyctaxi_2010-01 parquet       snappy  data_frame         1   latest
#> 9  nyctaxi_2010-01 parquet          lz4  data_frame         1   latest
#> 10 nyctaxi_2010-01 feather          lz4  data_frame         1   latest
#> 11 nyctaxi_2010-01 parquet uncompressed arrow_table        10   latest
#> 12 nyctaxi_2010-01 feather uncompressed arrow_table        10   latest
#> 13 nyctaxi_2010-01 parquet       snappy arrow_table        10   latest
#> 14 nyctaxi_2010-01 parquet          lz4 arrow_table        10   latest
#> 15 nyctaxi_2010-01 feather          lz4 arrow_table        10   latest
#> 16 nyctaxi_2010-01 parquet uncompressed  data_frame        10   latest
#> 17 nyctaxi_2010-01 feather uncompressed  data_frame        10   latest
#> 18 nyctaxi_2010-01 parquet       snappy  data_frame        10   latest
#> 19 nyctaxi_2010-01 parquet          lz4  data_frame        10   latest
#> 20 nyctaxi_2010-01 feather          lz4  data_frame        10   latest
#>    did_error
#> 1      FALSE
#> 2      FALSE
#> 3      FALSE
#> 4      FALSE
#> 5      FALSE
#> 6      FALSE
#> 7      FALSE
#> 8      FALSE
#> 9      FALSE
#> 10     FALSE
#> 11     FALSE
#> 12     FALSE
#> 13     FALSE
#> 14     FALSE
#> 15     FALSE
#> 16     FALSE
#> 17     FALSE
#> 18     FALSE
#> 19     FALSE
#> 20     FALSE
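
For readers who want to see the shape of the idea, here is a minimal sketch of how a parameters-plus-status table could be assembled. It is not the code in this PR: `mock_results`, its `params` and `error` fields, and the helper `summarise_params()` are all hypothetical stand-ins, assumed only for illustration.

```r
# Hypothetical stand-in for benchmark results: each run keeps its
# parameters and, if it failed, an error message (NULL otherwise).
mock_results <- list(
  list(
    params = list(format = "parquet", compression = "snappy", cpu_count = 1L),
    error = NULL
  ),
  list(
    params = list(format = "feather", compression = "lz4", cpu_count = 10L),
    error = "simulated failure"
  )
)

# Illustrative helper (not arrowbench's implementation): one row per run,
# one column per parameter, plus a logical did_error column.
summarise_params <- function(results) {
  rows <- lapply(results, function(res) {
    row <- as.data.frame(res$params, stringsAsFactors = FALSE)
    row$did_error <- !is.null(res$error)
    row
  })
  do.call(rbind, rows)
}

summary_df <- summarise_params(mock_results)
print(summary_df)
```

Under these assumptions, the second mock run carries an error message, so its `did_error` value is `TRUE` while the first is `FALSE`.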
boshek commented 2 years ago

double check that it is pulling in the errors like we want it to

The way it works now makes sense to me, strictly from a "did it error" perspective.
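
As a rough illustration of that "did it error" use, the sketch below filters a summary table down to the failed runs. The data frame is made up (including the single `TRUE` value) and only mirrors some of the columns shown in the output above; it is not real benchmark output.

```r
# Hypothetical summary with a subset of the columns shown earlier;
# the TRUE value is invented to demonstrate the filtering step.
params_summary <- data.frame(
  format = c("parquet", "feather", "parquet"),
  compression = c("uncompressed", "lz4", "snappy"),
  cpu_count = c(1L, 1L, 10L),
  did_error = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only the parameter combinations whose run errored.
failed <- params_summary[params_summary$did_error, , drop = FALSE]
print(failed)

# Quick overall checks: did any run fail, and how many?
any_failed <- any(params_summary$did_error)
n_failed <- sum(params_summary$did_error)
```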

jonkeane commented 2 years ago

😅 release took a while since I imagine RSPM isn't yet up to date!