duckdblabs / db-benchmark

reproducible benchmark of database-like ops
https://duckdblabs.github.io/db-benchmark/
Mozilla Public License 2.0

Fix collapse groupby q8 #54

Closed SebKrantz closed 1 month ago

Tmonster commented 8 months ago

Thanks! Since this only enables one query, I can merge it now as is. The only thing is that I don't know when the report will be updated. I'm trying to move away from running all the benchmarks myself since it can require a lot of overhead.

If you want to run the benchmarks for q8 yourself on a c6id.metal machine and add the results, I will happily merge and update the report. If that's too much overhead, I would recommend waiting until another version of collapse comes out, then re-running the whole benchmark on the new version and reporting the updated results. I would also happily merge that 👍

SebKrantz commented 8 months ago

Thanks @Tmonster. I don't have a c6id.metal machine, only an M1 Mac (which, btw, I think would be a much more interesting machine for the benchmark, as it is widely known and used and faster than most servers, though of course not able to do 50 GB). I think this will have to wait, then, until the benchmarks are re-run. I am publishing minor versions (collapse is at 2.0.6 now), and in general I don't think there will be large performance improvements in the near future, so it would be good to see collapse in the rankings for advanced operations at some point.

Tmonster commented 8 months ago

A c6id.metal machine is just an AWS machine that can be rented by anyone with access to AWS. I don't know if I will be re-running the benchmarks much myself anymore, since the new PR process allows others to run them themselves and PR the results, and then I can publish a new report in 5 minutes instead of the many hours it takes to run the benchmark.

I agree that an M1 Mac would be better. We chose a c6id.metal machine for the reasons mentioned in the blog post. I just discovered that I can also scale down a c6id.metal machine to run as a c6id.4xlarge (16 cores, 32 GB memory) with only one tenant. This setup gives a comparable environment: a smaller, personal-laptop-sized one, on a single machine.

In the future I would like to report results on a machine similar to an M1, but since I just changed to the c6id.metal, I want to wait a bit before switching the machine type again.