dbt-labs / dbt-bigquery

dbt-bigquery contains all of the code required to make dbt operate on a BigQuery database.
https://github.com/dbt-labs/dbt-bigquery
Apache License 2.0

ADAP-25 Run python model test based on schedule #306

Open ChenyuLInx opened 1 year ago

ChenyuLInx commented 1 year ago

Currently, when we run our Python model tests with GHA, multiple tests run at the same time. With Dataproc (cluster or serverless), tests fail because the underlying infrastructure is overloaded. See this as an example.

We should skip Python model tests in normal workflows, but create a scheduled run that executes them every day so that we can still catch regressions.
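One way to express the "skip in normal workflows, run on a schedule" rule is a small gate keyed on the `GITHUB_EVENT_NAME` variable that GitHub Actions sets for each run. This is only a sketch, not the repository's actual implementation: the helper name `should_run_python_model_tests` and the set of allowed events are assumptions made for illustration.

```python
import os
from typing import Mapping

# Assumption: Python model tests run only for scheduled runs and manual
# dispatches. Adjust this set to match the workflow that is actually created.
PYTHON_MODEL_TEST_EVENTS = frozenset({"schedule", "workflow_dispatch"})


def should_run_python_model_tests(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the triggering GitHub Actions event is allowed.

    GITHUB_EVENT_NAME is set by GitHub Actions; outside of CI it is usually
    absent, in which case the tests are skipped.
    """
    return env.get("GITHUB_EVENT_NAME", "") in PYTHON_MODEL_TEST_EVENTS
```

A test module could then apply something like `pytest.mark.skipif(not should_run_python_model_tests(), reason="runs on schedule only")` to the Python model test classes, so that pull request runs skip them while the daily run still exercises them.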

When turning the tests back on, we also need to include the PySpark DataFrame test that was added later (https://github.com/dbt-labs/dbt-core/pull/5906).

ChenyuLInx commented 1 year ago

More color on the tests failing because Dataproc does not scale well enough. There are two ways to submit a Dataproc job: cluster and serverless. For cluster, we run an always-on cluster; for serverless, GCP spins up a short-lived server to run just one job.