timescale / tsbs

Time Series Benchmark Suite, a tool for comparing and evaluating databases for time series data
MIT License

Add query timeout via --max-total-duration #149

Closed mdcallag closed 3 years ago

mdcallag commented 3 years ago

This adds a timeout for query tests via the --max-total-duration option which is the max duration for all queries. Queries won't be started once this is exceeded but the currently running query is allowed to finish.

mdcallag commented 3 years ago

blagojts has context on this. I am a golang noob.

jonatas commented 3 years ago

Hello @mdcallag! Thanks for the PR!

The main objective of tsbs is measuring time, so if we limit the time, we'll never know the truth, right?

What is the main reason for adding this param? Would you mind sharing a bit of your context?

mdcallag commented 3 years ago

As background, I evaluated the implementation for MongoDB and Postgres and then added a MySQL port. Without naming the DBMS, there were a few queries that ran forever (hours) because of either a bad query plan or a non-performant query (rewriting the query fixes it).

Without the timeout there were 1+ queries that ran for hours while other queries ran for seconds or minutes. Such outliers make perf testing a bad experience -- wasting HW and time.

This provides an optional SLA, and the use of SLAs matches what happens in production (user goes away after waiting too long) and benchmarks (TPC-C has response time requirements).

FWIW, one of the MongoDB queries is broken, but fortunately broken in a way that makes it faster.

I don't mind if you ignore this. I am less interested in TSBS than I used to be. I am not a fan of load, then query workloads, especially when the workload being modeled has queries concurrent with writes.

jonatas commented 3 years ago

@mdcallag, thanks for the details. That is very interesting, and we should probably find a way to report or flag the queries that time out, because this is unexpected behavior.

> I don't mind if you ignore this. I am less interested in TSBS than I used to be.

I'm just trying to help with the project. I'm learning more Go through reviews while also learning more about the internals of the tool :)

> I am not a fan of load, then query workloads, especially when the workload being modeled has queries concurrent with writes.

I'm still not that deep into the architecture. Suggestions for moving the workload architecture away from queries concurrent with writes and toward a more isolated model are very welcome. I only use tsbs load with the simulator option to generate and send data simultaneously. For queries, I always work in two steps: first I run one command to generate the queries, then another to execute them and collect the metrics.

mdcallag commented 3 years ago

I appreciate your responses, but it took ~5 months to get a reply on my PR and AFAIK the code has changed a lot since then, making this, or my other PRs, unlikely to merge cleanly. This isn't a great experience for an external contributor and I have moved on to other projects. I won't spend more time on these PRs.

jonatas commented 3 years ago

@mdcallag I know how it feels and I know it's disappointing. Several times I have just closed my own PRs when they got old and gained no traction. Thanks for your sincere feedback.

We're discussing how to better manage the community contributions and one of my goals now is to try to review and update all open PRs or close them to allow us to keep more up to date with the external contributions.

I hope things get better here in the near future!