Closed by jangorecki 5 years ago
It is worth noting that because this dataset is so extremely unbalanced, we end up with a lot of exception handling in the benchplot functions, simply because there are so few results for 1e9 k=2. Just another one today: https://github.com/h2oai/db-benchmark/commit/cf21830c8926ee37696c1589f6aff6379e2449b8
The ClickHouse MergeTree table engine is also being killed on q3 for 1e9 k=2.
As agreed with Matt before, we can close this one. It is useful to know if/when the issue will be handled, so let's keep the k=2 factor even though it complicates handling for all the tools.
As of now, the most unbalanced cardinality test (k=2) is failing for most of the tools.
It is quite likely that even if pandas and dask could read the data, they would fail on question 3, as they don't scale well with cardinality.
Unless we define exceptions in the launcher script, each upgrade of those tools (or a "force run") triggers another attempt to run those scripts for those tools.
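For illustration only, here is a minimal sketch of what such an exception list could look like. It is written in Python and is not the project's actual launcher; the `Run` type, the `SKIP` set, the `plan_runs` helper and the dataset labels (`1e9_k2`, `1e9_k1e2`) are all hypothetical names made up for this example.

```python
from typing import Iterable, List, NamedTuple, Set


class Run(NamedTuple):
    """One (tool, dataset) combination the launcher could start."""
    tool: str
    dataset: str


# Hypothetical exception list: combinations known to fail, so that an
# upgrade or a forced re-run does not retry them every time.
SKIP: Set[Run] = {
    Run("pandas", "1e9_k2"),
    Run("dask", "1e9_k2"),
}


def plan_runs(tools: Iterable[str], datasets: Iterable[str],
              skip: Set[Run] = SKIP) -> List[Run]:
    """Return every tool/dataset combination that is not in the skip set."""
    datasets = list(datasets)  # allow reuse across the outer loop
    return [
        Run(tool, dataset)
        for tool in tools
        for dataset in datasets
        if Run(tool, dataset) not in skip
    ]


if __name__ == "__main__":
    for run in plan_runs(["pandas", "dask", "clickhouse"],
                         ["1e9_k2", "1e9_k1e2"]):
        print(f"would launch {run.tool} on {run.dataset}")
```

Keeping the exceptions as data rather than as conditionals in the launcher would make it easy to drop an entry once a tool starts handling k=2.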
Currently the three tools that successfully handle 1e9 k=2 spend 4 hours on it in total, while the three other datasets (k=1e2, k=1e1 and k=1e2 sorted) take 5 hours in total for those tools.