GlareDB / glaredb

GlareDB: An analytics DBMS for distributed data
https://glaredb.com
GNU Affero General Public License v3.0

chore: Add github action to run benchmarks #1846

Closed: vrongmeal closed this 7 months ago

vrongmeal commented 8 months ago

Runs the benchmarks three times at each scale factor (currently only scale factor 1).

Fixes #1822

vrongmeal commented 8 months ago

Here are a few TODOs:

vrongmeal commented 8 months ago

> > Save/Update each benchmark run
>
> could you say more about this? is this the "data collection" aspect?

Yeah. It's regarding data collection. We have a timings.csv generated here. Would want to at least save it somewhere; a sketch of one option is below.
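For now, saving the file as a workflow artifact would be enough. A minimal sketch using the stock actions/upload-artifact action, assuming the harness writes the CSV to benchmarks/timings.csv (the path and artifact name are assumptions, not the actual layout):

    # Step inside the benchmark job.
    - name: Upload benchmark timings
      uses: actions/upload-artifact@v3
      with:
        name: timings-sf${{ matrix.scale_factor }}
        path: benchmarks/timings.csv  # hypothetical location of the generated CSV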

vrongmeal commented 8 months ago

We have a working CI, with the following issues:

  1. DuckDB query 13 errors (a possible fix is sketched after this list):

         Traceback (most recent call last):
           File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
             return _run_code(code, main_globals, None,
           File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
             exec(code, run_globals)
           File "/home/runner/work/glaredb/glaredb/benchmarks/tpch/duckdb_queries/q13.py", line 42, in <module>
             q()
           File "/home/runner/work/glaredb/glaredb/benchmarks/tpch/duckdb_queries/q13.py", line 33, in q
             utils.get_customer_ds()
           File "/home/runner/work/glaredb/glaredb/benchmarks/tpch/duckdb_queries/utils.py", line 44, in get_customer_ds
             return _scan_ds(join(base_dir, "customer"))
           File "/home/runner/work/glaredb/glaredb/benchmarks/tpch/duckdb_queries/utils.py", line 31, in _scan_ds
             duckdb.sql(q)
         duckdb.CatalogException: Catalog Error: Table with name "_home_runner_work_glaredb_glaredb_benchmarks_tpch_tables_scale_1_customer_parquet" already exists!
  2. We want to save the timings.csv. I'm thinking of opening a new issue for storing this data in GlareDB in a format that we can analyze easily, i.e., saving the average of all the runs in one CI job along with the current timestamp and commit hash, something like:

         timestamp,commit,system,query,time
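For issue 1, the error suggests _scan_ds registers the same view name twice within one process. A minimal sketch of the kind of fix I mean, assuming the helper derives the name from the parquet path (the body here is a guess at the real helper, not the actual code):

    import duckdb

    def _scan_ds(path: str) -> str:
        """Register a parquet file as a DuckDB view and return the view name."""
        path = f"{path}.parquet"
        # Derive an identifier from the path, matching the name seen in the
        # error message above.
        name = path.replace("/", "_").replace(".", "_")
        # CREATE OR REPLACE VIEW makes registration idempotent, so a second
        # run in the same process no longer raises a CatalogException for an
        # already-existing name.
        duckdb.sql(f"CREATE OR REPLACE VIEW {name} AS SELECT * FROM read_parquet('{path}')")
        return name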
tychoish commented 8 months ago

> timestamp,commit,system,query,time

commit changes on PRs, and might be confusing, might need more metadata, and also some information about which system's being used.

time as observed where? I'd label time as duration (and decide on units that make sense.)

parquet isn't a terrible format for this kind of thing.

tychoish commented 8 months ago

> GlareDB CI / Benchmarks (Scale Factor = ${{ matrix.scale_factor }}) (pull_request) Skipped

seems wrong

vrongmeal commented 8 months ago

> > GlareDB CI / Benchmarks (Scale Factor = ${{ matrix.scale_factor }}) (pull_request) Skipped
>
> seems wrong

Yeah, this is because of the if statement in the job. I'll move these benchmarks to another workflow file (the one with the image builds, since that's when we run these), roughly like the sketch below.
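A sketch of what I mean; the workflow name, trigger, and benchmark entrypoint are all hypothetical, not the actual files:

    name: image-build  # hypothetical: whichever workflow builds the images
    on:
      push:
        branches: [main]
    jobs:
      benchmarks:
        name: Benchmarks (Scale Factor = ${{ matrix.scale_factor }})
        runs-on: ubuntu-latest
        strategy:
          matrix:
            scale_factor: [1]
        steps:
          - uses: actions/checkout@v3
          # No job-level `if:` needed here, so the check never shows up as
          # "Skipped" on pull requests; the workflow simply doesn't trigger there.
          - run: python -m benchmarks.run --scale-factor ${{ matrix.scale_factor }}  # hypothetical entrypoint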

> commit changes on PRs, and might be confusing, might need more metadata, and also some information about which system's being used.

Yeah, commit would work for merges on main, but we'll need more information going forward.

> time as observed where? I'd label time as duration (and decide on units that make sense.)

Agreed.

> parquet isn't a terrible format for this kind of thing.

Yup, just wanted to mention the schema for the sake of the comment, so did it in CSV format.

tychoish commented 8 months ago

> Yeah, commit would work for merges on main, but we'll need more information going forward.

yeah. I think we need (actions_build_id, execution_number, commit) (or maybe not commit at that point). Either we just normalize it and be ok with it (parquet should compress this well, I hope), or we can have a file/table/etc. that has all the metadata, and that we link to the data collected in a run.

I imagine we'll also end up wanting to collect something more about the variables under test (currently system), as in: scale_factor, num_cores, repeat_id, db_under_test (e.g. for the other engines?). Something like the sketch below.

(and just to be clear, we should collect timestamps when the test finishes running, say)
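To make that concrete, a sketch of a per-run parquet file with those columns (every column name and value here is provisional, just illustrating the shape):

    import datetime

    import pyarrow as pa
    import pyarrow.parquet as pq

    # One row per (query, repeat); run-level metadata is repeated on every row,
    # which parquet's dictionary encoding should compress well.
    schema = pa.schema([
        ("actions_build_id", pa.string()),
        ("execution_number", pa.int64()),
        ("commit", pa.string()),
        ("db_under_test", pa.string()),          # e.g. "glaredb", "duckdb"
        ("scale_factor", pa.int64()),
        ("num_cores", pa.int64()),
        ("repeat_id", pa.int64()),
        ("query", pa.string()),
        ("duration_ms", pa.float64()),           # explicit unit in the name
        ("finished_at", pa.timestamp("us", tz="UTC")),
    ])

    # A single illustrative row; real values would come from the CI run.
    rows = {
        "actions_build_id": ["123456789"],
        "execution_number": [1],
        "commit": ["abc1234"],
        "db_under_test": ["glaredb"],
        "scale_factor": [1],
        "num_cores": [8],
        "repeat_id": [0],
        "query": ["q13"],
        "duration_ms": [1234.5],
        "finished_at": [datetime.datetime.now(datetime.timezone.utc)],
    }
    pq.write_table(pa.table(rows, schema=schema), "timings.parquet")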

> Yup, just wanted to mention the schema for the sake of the comment, so did it in CSV format.

Totally makes sense, just wanted to be sure 😅