databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0
318 stars 103 forks source link

Add `serverless_compute_id` field to the config #675

Closed alexkh-db closed 3 weeks ago

alexkh-db commented 4 weeks ago

Changes

We propose adding a new serverless_compute_id attribute in the config that will be used to enable serverless compute in the downstream applications. The default value to enable serverless is serverless_compute_id = auto, other values might be used in the future if needed. Config doesn't have to validate the value in serverless_compute_id, this will be done in the downstream applications.

Tests

codecov-commenter commented 4 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 57.69%. Comparing base (f50af60) to head (d135f04).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #675 +/- ## ======================================= Coverage 57.69% 57.69% ======================================= Files 48 48 Lines 33079 33080 +1 ======================================= + Hits 19084 19085 +1 Misses 13995 13995 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

alexkh-db commented 4 weeks ago

LGTM. Docs?

I couldn't find any docs for the other related fields, e.g. cluster_id, so i think we are not documenting it (at least yet). @mgyucht please lmk if documentation is needed

grundprinzip commented 3 weeks ago

Config doesn't have to validate the value in serverless_gc_id, this will be done in the downstream applications.

Isn't the only allowed value at the moment "auto" then? If this is the case, shouldn't it have validation?

alexkh-db commented 3 weeks ago

Config doesn't have to validate the value in serverless_gc_id, this will be done in the downstream applications.

Isn't the only allowed value at the moment "auto" then? If this is the case, shouldn't it have validation?

i'm open to both options, the pros for not validating in config IMO are: 1) config is doing little validation for other fields and mostly leaves validation to the downstream applications. not sure if we should single out serverless_gc_id field 2) if dbconnect or any other downstream applications will need to add another value except auto, databricks-sdk code doesn't need to change and the user has one less package to upgrade

grundprinzip commented 3 weeks ago

Sorry for being horrible annoying. According to https://docs.databricks.com/en/compute/serverless.html I would just name the field serverless_compute_id.