Report step does not run

maartenbreddels commented 4 years ago

I'm trying to see if I can make vaex run your benchmark suite. As a start, I tried to run just the pandas benchmark, but I have trouble running the report:

$ Rscript -e 'rmarkdown::render("./_report/index.Rmd", output_dir="public")'
  |...................                                                   |  27%
label: init
Quitting from lines 24-51 (index.Rmd) 
Error in sum(int, dbl) : invalid 'type' (list) of argument
Calls: <Anonymous> ... model_time -> nrow -> [ -> [.data.table -> approxUniqueN1

Execution halted

Not knowing much about R, maybe you can help me with this error message.

jangorecki commented 4 years ago

Hello,

Do the pandas benchmark scripts run successfully? Is there a time.csv, and does it have the expected entries populated by the benchmark script?

The code chunk you pasted is related to producing the report page that is on the website. It is not related to the benchmark scripts themselves, but only to presenting the timings produced by the benchmark. I am pretty sure it will run into trouble if time.csv has entries for only a single solution.

The easiest way to look at the results of your benchmark is to look at time.csv. You can find a description of some of the fields in the _docs/maintenance.md#reading-csv-logs-and-timings document.
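A quick way to check whether a time.csv covers only a single solution is to count rows per solution. The following is a minimal sketch, not part of the repository: it assumes the file has a header row and a column named `solution` (a hypothetical name here; check the actual header against the maintenance document), and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_rows_per_solution(time_csv_text, solution_field="solution"):
    """Count timing rows per solution in the text of a time.csv file.

    `solution_field` is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(io.StringIO(time_csv_text))
    if solution_field not in (reader.fieldnames or []):
        raise KeyError(f"column {solution_field!r} not found in header")
    return Counter(row[solution_field] for row in reader)


# Made-up sample content; real files have more columns.
sample = "nodename,solution,time_sec\nhost-a,pandas,1.5\nhost-a,pandas,1.7\n"
counts = count_rows_per_solution(sample)
if len(counts) < 2:
    print("time.csv has entries for a single solution only:", dict(counts))
```

For a file on disk, pass `open("time.csv").read()` as the argument.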

If what you need is to render the report, then the easiest way to do it should be:

grab time.csv and logs.csv from the h2o report website
tweak the nodename field in your csv files to match the one from h2o
append your time.csv and logs.csv to those from h2o
try rendering the report
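The tweak-and-append steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's own tooling: it assumes both files share the same header, that the field is literally named `nodename` as mentioned above, and that the reference file uses one nodename throughout. The function name, the other column names, and the sample values are hypothetical.

```python
import csv
import io


def append_timings(reference_csv, own_csv, nodename_field="nodename"):
    """Return reference rows followed by own rows, with own nodename rewritten.

    Both arguments are the text of CSV files with identical headers.
    """
    ref_reader = csv.DictReader(io.StringIO(reference_csv))
    ref_rows = list(ref_reader)
    own_reader = csv.DictReader(io.StringIO(own_csv))
    own_rows = list(own_reader)

    if ref_reader.fieldnames != own_reader.fieldnames:
        raise ValueError("headers differ; align the columns before appending")
    if not ref_rows:
        raise ValueError("reference file has no rows to take a nodename from")

    # Assumption: the reference file uses a single nodename; take the last one.
    target_nodename = ref_rows[-1][nodename_field]
    for row in own_rows:
        row[nodename_field] = target_nodename

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=ref_reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(ref_rows + own_rows)
    return out.getvalue()


# Made-up sample data for illustration only.
reference = "nodename,solution,time_sec\nhost-a,pandas,1.5\n"
own = "nodename,solution,time_sec\nmy-laptop,vaex,0.9\n"
merged = append_timings(reference, own)
print(merged)
```

The same function can be applied to logs.csv, since the steps treat both files the same way. Writing the result to a new file rather than overwriting the downloaded one keeps the original available if the report still fails to render.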

Feel free to reach out to me if you need help with any of these.

maartenbreddels commented 4 years ago

Hi Jan,

thanks for your quick reaction. I tried running the script with the downloaded csv, and that works! Thanks. I just want to be sure everything runs fine before I make a PR to this repo. Should I use https://github.com/h2oai/db-benchmark/pull/8 as a template?

cheers,

Maarten

jangorecki commented 4 years ago

Better to copy recent scripts instead. For example, we do not use pydatatable/pydatatable.sh as a launcher anymore. It is best to copy the recent pandas groupby and join scripts and work on those.

jangorecki commented 4 years ago

@maartenbreddels I am closing this issue as it seems to be answered.

maartenbreddels commented 4 years ago

Absolutely, thanks!

h2oai / db-benchmark

Report step does not run #150