databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

Support for new serverless compute #653

Open gaoshihang opened 2 months ago

gaoshihang commented 2 months ago

Describe the feature

Databricks released the new Serverless Compute on May 1st. I think it will help our pipelines a lot by avoiding the slow spin-up time of all-purpose and job clusters. Can we add support for this type of compute?

Describe alternatives you've considered

Additional context

https://docs.databricks.com/en/workflows/jobs/run-serverless-jobs.html

Who will this benefit?

I think all kinds of jobs can benefit from this, because we would no longer need to manage the cluster ourselves.

Are you interested in contributing this feature?

Yes, but I don't know how to do it.

benc-db commented 2 weeks ago

In progress. It will only work for Python models since, much like job clusters, I don't think these nodes have a Thrift server.

dbph commented 2 days ago

Hi @benc-db ,

I would like to use Serverless as the compute for the DBT-CLI, as this would save the roughly five minutes it takes to spin up compute for each DBT job.

Currently, trying to do that results in the following error message:

Databricks adapter: Connection(session-id=Unknown) - Exception while trying to create connection: Error during request to server

Error properties: attempt=1/30, bounded-retry-delay=None, elapsed-seconds=843.3235409259796/900.0, error-message=, http-code=None, method=OpenSession, no-retry-reason=non-retryable error, original-exception=Retry request would exceed Retry policy max retry duration of 900.0 seconds, query-id=None, session-id=None

Will your pull request also solve this issue?

Thanks! :-)

benc-db commented 2 days ago

@dbph I would file a ticket with your company's Databricks contact, because that scenario is already supposed to be supported, and we'd need the Jobs team to investigate why it's not working in your case.