raunakab commented 3 weeks ago

Overview

Create new GHA workflow for building a commit and running tpch against it.

Notes

There are 2 main workflows:

- build-commit.yaml
- run-tpch.yaml

The final workflow, build-commit-run-tpch.yaml, just runs the above two sequentially.

I've also made some changes to benchmarking/tpch/__main__.py. Namely:

- Added all env-vars that start with DAFT to the ray-runtime-env variables that are sent during ray-cluster initialization.
- Added a flag to turn off sending the daft module to the ray-cluster during initialization.
  - No need to pickle the daft module and send it over; it's already installed on the ray-cluster from the AWS S3 link pointing to the prebuilt python-wheel.
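
A minimal sketch of the two changes described above, not the actual code in benchmarking/tpch/__main__.py. The function name `build_runtime_env`, the `ship_daft_module` parameter, and the `daft_module_path` placeholder are hypothetical; `env_vars` and `py_modules` are standard keys of Ray's `runtime_env` dict.

```python
import os
from typing import Mapping, Optional


def build_runtime_env(
    environ: Optional[Mapping[str, str]] = None,
    ship_daft_module: bool = True,
    daft_module_path: str = "daft",  # hypothetical placeholder path
) -> dict:
    """Assemble a Ray runtime_env dict (illustrative sketch only)."""
    environ = os.environ if environ is None else environ

    # Forward every environment variable whose name starts with "DAFT".
    env_vars = {k: v for k, v in environ.items() if k.startswith("DAFT")}
    runtime_env = {"env_vars": env_vars}

    # Only ship the local module when asked to; a cluster that already
    # installed the prebuilt wheel does not need it.
    if ship_daft_module:
        runtime_env["py_modules"] = [daft_module_path]
    return runtime_env
```

The resulting dict could then be passed as `ray.init(runtime_env=...)`; with `ship_daft_module=False` nothing is pickled and sent over, so the cluster uses whatever wheel it installed.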

I've summarized the workflows individually down below:

build-commit workflow

- uses buildjet for building
- caching enabled
- builds a release python-wheel and stores it in AWS S3
- without caching, builds take around 6-7min
- with caching, builds take roughly 3min
- with aggressive caching, builds take <40s

run-tpch workflow

- pulls the AWS S3 python-wheel and runs it
- runs the benchmarking.tpch benchmark
- produces an output.csv and sends it back to GHA to be displayed
- renders the output directly in the Summary Page
- grabs the Ray logs and uploads them to the GHA Summary Page as well
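
One way the rendering step could work, sketched under assumptions: the helper names `csv_to_markdown` and `append_to_step_summary` are hypothetical and the real workflow may do this differently. `GITHUB_STEP_SUMMARY` is the file path that GitHub Actions exposes for job summaries.

```python
import csv
import os


def csv_to_markdown(csv_path: str) -> str:
    """Convert a CSV file (first row = header) into a markdown table."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines) + "\n"


def append_to_step_summary(markdown: str) -> None:
    """Append markdown to the GHA job summary, or print it when run locally."""
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path is None:
        print(markdown)
        return
    with open(summary_path, "a") as f:
        f.write(markdown)
```

Calling `append_to_step_summary(csv_to_markdown("output.csv"))` in a workflow step would make the table show up on the run's Summary Page.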

build-commit-run-tpch workflow

- literally just invokes the first two in sequential order
- maps the built python-wheel of the build-commit job (1st one) to the input of the run-tpch job (2nd one)

codspeed-hq[bot] commented 3 weeks ago

| | Benchmark | `main` | `feat/infra` | Change |
| --- | --- | --- | --- | --- |
| ❌ | `test_iter_rows_first_row[100 Small Files]` | 264.6 ms | 404.6 ms | -34.59% |
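
The change column is not the naive slowdown (404.6 / 264.6 would be roughly +53%). The numbers are consistent with expressing the change relative to the head measurement, i.e. `base / head - 1`. This is an inference from the figures shown, not something stated in the report, and the function name below is hypothetical.

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Percentage change expressed relative to the head measurement."""
    return (base_ms / head_ms - 1.0) * 100.0


change = relative_change(264.6, 404.6)
print(f"{change:.1f}%")  # prints -34.6%
```

The small gap to the reported -34.59% is presumably because the displayed timings are themselves rounded.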

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 76.37%. Comparing base (ec39dc0) to head (e908742).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3184      +/-   ##
==========================================
- Coverage   76.54%   76.37%   -0.17%
==========================================
  Files         685      685
  Lines       85269    85135     -134
==========================================
- Hits        65266    65020     -246
- Misses      20003    20115     +112
```

[see 35 files with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3184/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Eventual-Inc)

raunakab commented 1 week ago

Example: https://github.com/Eventual-Inc/Daft/actions/runs/11926775586

Run by @desmondcheongzx. Run was submitted locally using the gh CLI tool. Invocation was:

gh workflow run build-commit-run-tpch.yaml --ref $BRANCH_NAME -f skip_questions=$SKIP_QUESTIONS

raunakab commented 1 week ago

Tagging @colin-ho. You recently touched the benchmarking/tpch/__main__.py file. Just wanted to run some of those changes by you first.

Eventual-Inc / Daft

[FEAT] GHA workflow to perform tcph benchmarking #3184

CodSpeed Performance Report

Merging #3184 will degrade performances by 34.59%
