h2oai / db-benchmark

reproducible benchmark of database-like ops
https://h2oai.github.io/db-benchmark
Mozilla Public License 2.0
325 stars, 88 forks

allow solutions to load data on demand for joining task #234

Open monopolynomial opened 2 years ago

monopolynomial commented 2 years ago

This is only a suggestion, and I am not sure whether it fits your current goals. I think it is worthwhile because, at least for huge data, it can make the results more informative: for instance, it can help show how the solutions perform when joining large data with tiny data (the 50G case).

jangorecki commented 2 years ago

Could you explain your suggestion in more detail? Currently the RHS table comes in 3 different sizes, to test joining against small, medium and big (same size as LHS) tables. What do you mean by "on demand"?

monopolynomial commented 2 years ago

I.e. the big data file isn't needed until the last task, so letting solutions load it only at that point can help them finish the smaller tasks.
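
To make the proposal concrete, here is a minimal sketch of what "loading on demand" could look like. It is not how db-benchmark is implemented: `LazyTable`, `run_join_tasks`, the file names, and the toy row counts are all hypothetical, and it uses only the Python standard library rather than any of the benchmarked solutions. The idea is that each RHS table is read only when its join task starts and is released afterwards, so the big file never occupies memory during the small and medium joins.

```python
import csv
import os
import tempfile


class LazyTable:
    """Hypothetical helper: reads a CSV file only on first access."""

    def __init__(self, path):
        self.path = path
        self._rows = None  # nothing is read until .rows is used

    @property
    def loaded(self):
        return self._rows is not None

    @property
    def rows(self):
        if self._rows is None:
            with open(self.path, newline="") as f:
                self._rows = list(csv.DictReader(f))
        return self._rows

    def release(self):
        """Drop the cached rows so the memory can be reclaimed."""
        self._rows = None


def inner_join(lhs_rows, rhs_rows, key):
    """Plain hash join on a single key column."""
    index = {}
    for r in rhs_rows:
        index.setdefault(r[key], []).append(r)
    out = []
    for l in lhs_rows:
        for r in index.get(l[key], []):
            out.append({**r, **l})
    return out


def run_join_tasks(lhs, rhs_tables, key):
    """Run one join per RHS table, loading each RHS only when its task starts."""
    results = {}
    for name, rhs in rhs_tables.items():
        joined = inner_join(lhs.rows, rhs.rows, key)  # rhs is read here
        results[name] = len(joined)
        rhs.release()  # free this RHS before the next task
    return results


def write_csv(path, header, rows):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def demo():
    """Build toy LHS/RHS files (sizes are made up) and run the three joins."""
    tmp = tempfile.mkdtemp()
    lhs_path = os.path.join(tmp, "lhs.csv")
    write_csv(lhs_path, ["id", "v1"], [(i, i * 10) for i in range(6)])
    rhs_tables = {}
    for name, n in {"small": 2, "medium": 4, "big": 6}.items():
        path = os.path.join(tmp, f"rhs_{name}.csv")
        write_csv(path, ["id", "v2"], [(i, i + 100) for i in range(n)])
        rhs_tables[name] = LazyTable(path)  # registered, not loaded
    return run_join_tasks(LazyTable(lhs_path), rhs_tables, "id"), rhs_tables
```

Calling `demo()` runs the small, medium and big joins in that order. A solution that runs out of memory on the big RHS would, under this scheme, still have completed and reported the two smaller joins.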