bkamins closed this issue 3 years ago
Yes to both, but unfortunately it may take some time. I won't have a workstation for a couple of weeks and cannot refresh the dbb environment.
Sure - this can wait. I am also curious what the results will be.
Additionally - are there instructions somewhere on how to reproduce the test datasets? When I simply tried to run https://github.com/h2oai/db-benchmark/blob/master/_data/join-datagen.R I got:
Error in sample.int(length(x), size, replace, prob) :
invalid first argument
Calls: data.table -> sample_all -> sample -> sample -> sample.int
Execution halted
I am on data.table v1.13.0.
I am closing this for now, as I managed to run the tests locally. I will try to make the code run fast on the original cases.
@jangorecki when looking at the benchmarks for join I have noticed that different packages sometimes pass the tables in a different order. In DataFrames.jl this order currently matters (when joining, the "small" table should go first). So my questions are:
Thank you!
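For readers wondering why argument order can affect join speed at all, here is a minimal sketch. It assumes a plain hash join that builds its lookup index from the first table and probes it with the second. Whether DataFrames.jl worked this way at the time is an assumption, not something stated in this thread, and the names `hash_join`, `small`, and `big` are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Inner join of two lists of dicts.

    The hash index is built from `build_rows` (the first argument),
    then every row of `probe_rows` is looked up in it.
    Returns the joined rows and the number of keys held in the index.
    """
    index = defaultdict(list)
    for row in build_rows:
        index[row[key]].append(row)

    joined = []
    for row in probe_rows:
        # .get avoids inserting empty entries for keys that do not match
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined, len(index)


# Illustrative data only: a 3-row table and a 1000-row table with unique ids.
small = [{"id": i, "label": f"s{i}"} for i in range(3)]
big = [{"id": i, "value": i * 10} for i in range(1000)]

rows_small_first, index_small = hash_join(small, big, "id")
rows_big_first, index_big = hash_join(big, small, "id")
```

Both calls return the same joined rows, but the index is built over 3 keys when the small table goes first and over 1000 keys when the big table goes first. In an implementation of this shape, putting the small table first keeps the build phase and its memory use small.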