duckdblabs / db-benchmark

reproducible benchmark of database-like ops
https://duckdblabs.github.io/db-benchmark/
Mozilla Public License 2.0
146 stars 29 forks

Update ClickHouse scripts and update results #67

Closed qoega closed 10 months ago

qoega commented 10 months ago

Initial ClickHouse support was added to the h2o-benchmark a long time ago, when some features may not yet have been present in ClickHouse. It was less maintainable, so the complex scripts were rewritten in a simpler, templated way.

This makes the comparison closer to the other systems.

Note that we specifically do not perform optimisations that require tuning settings to get better results for a specific workload, such as this one:

https://github.com/duckdblabs/db-benchmark/blob/49016238fdd78dcbc4e5119bb0cd34487ca5419c/duckdb-latest/join-duckdb-latest.R#L63-L66

jangorecki commented 10 months ago

> Although we specifically do not perform optimisations that require tuning settings to get better results for a specific workload, such as this one:
>
> https://github.com/duckdblabs/db-benchmark/blob/master/duckdb-latest/join-duckdb-latest.R#L63-L66

IMO that should not be allowed. The tools should detect this automatically and perform the optimization themselves, rather than requiring the user to tune the config for particular data.

qoega commented 10 months ago

@Tmonster Can you review this PR?

Tmonster commented 10 months ago

hi @qoega ,

I took a look at this today and ran the benchmark myself to make sure the results were accurate. First off, there are still some errors reported when running the script as is: clickhouse/exec.sh lines 138-140 attempt to drop tables that only exist in the join benchmark, even when the group-by benchmark is being run. This error causes the rest of the benchmark to behave erratically.

Also, it seems many of the commands are piped to stderr, which muddies the generated .err file. This is also not handy when running the benchmarks, as my immediate thought was that ClickHouse did not complete. Could you pipe this to the .out file instead?
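A minimal sketch of the requested redirection, with illustrative file names and `clickhouse-client` stood in for by plain shell commands so it runs anywhere:

```shell
#!/bin/sh
# Hypothetical sketch (names are illustrative): send normal command
# output to the .out log and keep .err for genuine errors only.
OUT=bench.out
ERR=bench.err
: >"$OUT"
: >"$ERR"

log_run() {
  # stdout of the wrapped command -> .out, stderr -> .err
  "$@" >>"$OUT" 2>>"$ERR"
}

log_run echo "q1 finished in 1.2s"      # lands in bench.out
log_run ls /nonexistent-path || true    # error text lands in bench.err
```

With this split, a non-empty .err really does indicate something went wrong, instead of containing routine progress output.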

I ran the benchmark on ClickHouse version 23.10.5.20 and saw slower results than reported in this PR: specifically, a 10-second difference for the basic group-by and 20 seconds for the advanced group-by when looking at the summaries. I will try again sometime next week, as I imagine it may have to do with a combination of upgrading ClickHouse while a different configuration file is in place. I can send/email you the results I generated if needed.

Do you think this might have to do with the slightly different versions? I'm usually under the impression that minor versions only address bug fixes and do not change performance, so I'm hesitant to blame it on that.

I'm going to work on including ClickHouse in the regression tests in the meantime.

qoega commented 10 months ago

> clickhouse/exec.sh lines 138-140 attempt to drop tables present in the join benchmark

I'll check this one; I think some copy-paste garbage was left behind. Originally the join tables were not cleared at all, and I added the drop logic there.
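One way to fix this is to guard the drop logic by the task being run. This is a hypothetical sketch, not the actual exec.sh code: the table names and the `TASK` variable are illustrative.

```shell
#!/bin/sh
# Hypothetical sketch: only emit DROP statements for the join tables
# when the join task actually ran. Table names are illustrative.
TASK=join   # in exec.sh this would come from the benchmark driver

drop_join_tables() {
  for t in x small medium big; do
    # IF EXISTS makes the drop a no-op if the table was never created
    echo "DROP TABLE IF EXISTS $t;"
  done
}

case "$TASK" in
  join) drop_join_tables >drops.sql ;;
  *)    : >drops.sql ;;   # group-by run: nothing to drop
esac

cat drops.sql
```

Even without the task guard, `DROP TABLE IF EXISTS` alone would avoid the reported error, since dropping a table that was never created becomes a no-op.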

> Also, it seems many of commands are piped to stderr, which muddies the .err file generated

I think it has always been that way, so I did not add any other logic. I will check what can be done and why it goes to stderr.

> Specifically a 10 second difference for basic group by and 20 seconds for advanced group by when looking at the summaries.

I can see images where the 50GB basic summary result is 36s and the advanced one is 96s, so 10-20 seconds looks significant. I will run again on a clean setup (as I had stopped the previous VM).

Can you check that your storage setup is correct and that ClickHouse runs with an NVMe drive for /var/lib/clickhouse (or the other data path, if you changed it)? Minor versions should not change performance, so the storage configuration is the easiest thing to check.
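A quick, hedged sketch of how one might check which device backs the data path. `/var/lib/clickhouse` is the default ClickHouse data path; the fallback to the current directory is only there so the sketch runs on any machine.

```shell
#!/bin/sh
# Hypothetical sanity check: which block device backs the ClickHouse
# data path? Falls back to "." so the sketch is runnable anywhere.
DATA_PATH=/var/lib/clickhouse
[ -d "$DATA_PATH" ] || DATA_PATH=.

DEV=$(df -P "$DATA_PATH" | tail -1 | awk '{print $1}')
echo "data path $DATA_PATH is backed by $DEV"

# On Linux, ROTA=0 marks a non-rotational (SSD/NVMe) device:
if command -v lsblk >/dev/null 2>&1; then
  lsblk -d -o NAME,ROTA
fi
```

If the device turns out to be a rotational disk or a network volume rather than local NVMe, that alone could explain a 10-20 second gap on the larger workloads.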

qoega commented 10 months ago

I checked what was wrong with .err: I just forgot to remove the `set -x` that I had added to debug the scripts.
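For future debugging, the tracing could be scoped to the section under investigation and sent to its own log, so the benchmark's .err file stays clean even if the debug line is left in. A hypothetical sketch:

```shell
#!/bin/sh
# Hypothetical sketch: scope `set -x` tracing to one section and send
# the trace to a dedicated log instead of the benchmark's .err file.
{
  set -x
  ls . >/dev/null    # the command being debugged
  set +x
} 2>trace.log

# Trace lines are prefixed with '+ ' by default (the PS4 prompt):
grep -c '^+' trace.log
```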

I ran locally again on a clean system, this is how basic groupby 50GB looks

![image](https://github.com/duckdblabs/db-benchmark/assets/2159081/524d93f1-6178-48be-9583-715f51243770)
Tmonster commented 10 months ago

Thanks! I'll take a look. Should I try to use v23.10.4.25 or do you think v23.10.5.20 will have the same result?

qoega commented 10 months ago

It should be the same. I just installed the latest deb package, and it was updated recently. (The changelog is small and only about bug fixes here.)

qoega commented 10 months ago

Do you have any update?

Tmonster commented 10 months ago

Sorry for the delay. Working on it today!

Tmonster commented 10 months ago

I added ClickHouse to the regression script as well, which is one of the causes of the conflict; merging the dask results is the other. If you could merge with master again, that would be much appreciated.

Tmonster commented 10 months ago

Looks good! I was finally able to verify the results.

Will merge when CI finishes

Tmonster commented 10 months ago

Thanks!

Tmonster commented 10 months ago

Results are also up.