databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

[FEATURE] CREATE TABLE API #427

Open ion-elgreco opened 1 year ago

ion-elgreco commented 1 year ago

Problem Statement: Over at delta-rs, we would like to register tables created with the deltalake Python package (https://github.com/delta-io/delta-rs) in Unity Catalog. Currently, this does not seem to be possible without a Databricks SQL warehouse executing a SQL command.

There should be a REST API to create a table, with an optional external location parameter.

https://databricks-sdk-py.readthedocs.io/en/latest/workspace/tables.html

Proposed Solution: Add a method called create to the TablesApi.

# Proposed call shape; this method does not exist in the SDK yet.
TablesApi.create(
    table_name = 'my_table',
    schema_name = 'schema',
    catalog_name = 'catalog',
    external_location = 's3://depts/finance/sec_filings'
)


ion-elgreco commented 1 year ago

There is a Terraform resource for this: https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/sql_table. It just needs to be exposed as a POST API, and then it can be added here.

mgyucht commented 10 months ago

This is a good suggestion. This can't be exposed as a REST API today, but we can do something similar to what we do in the TF provider and implement this functionality in the SDK as a custom method.

ion-elgreco commented 4 months ago

@mgyucht the CreateTable API already works in Databricks; I tried it out. Any chance you can get that into the SDK? Or are you open to me contributing it?

Here is what I ran, based on the CreateTable model (https://github.com/unitycatalog/unitycatalog/blob/main/api/Models/CreateTable.md):

import requests

# Placeholder: supply a Databricks access token here.
token = "<token>"

tables_uri = "https://<host>.azuredatabricks.net/api/2.1/unity-catalog/tables"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
body = {
    "name": "my_table",
    "catalog_name": "my_catalog",
    "schema_name": "my_schema",
    "table_type": "EXTERNAL",
    "data_source_format": "DELTA",
    "storage_location": "abfss://....",
}

# POST the table definition to the Unity Catalog tables endpoint.
response = requests.post(tables_uri, headers=headers, json=body)
response.json()
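
To make the idea of a custom helper concrete, here is a minimal sketch that wraps the request above in reusable functions. The function names (build_create_table_payload, build_create_table_request, create_external_table) are hypothetical and not part of databricks-sdk-py. The endpoint path and body fields are copied from the snippet above, and it uses only the standard library (urllib) so it runs without extra dependencies. It only covers external tables, since that is all I tested.

```python
import json
import urllib.request


def build_create_table_payload(catalog_name, schema_name, table_name,
                               storage_location, data_source_format="DELTA"):
    """Assemble the JSON body used in the request above (external tables only)."""
    return {
        "name": table_name,
        "catalog_name": catalog_name,
        "schema_name": schema_name,
        "table_type": "EXTERNAL",
        "data_source_format": data_source_format,
        "storage_location": storage_location,
    }


def build_create_table_request(host, token, payload):
    """Prepare (but do not send) the POST request for the tables endpoint."""
    return urllib.request.Request(
        url=f"https://{host}/api/2.1/unity-catalog/tables",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_external_table(host, token, **table_kwargs):
    """Hypothetical helper: send the request and return the decoded JSON response."""
    payload = build_create_table_payload(**table_kwargs)
    request = build_create_table_request(host, token, payload)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting payload construction from sending keeps the first two functions free of network access, so they can be unit tested without a workspace. A call would look like create_external_table(host, token, catalog_name="my_catalog", schema_name="my_schema", table_name="my_table", storage_location="abfss://....").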

dipankarkush-db commented 19 hours ago

Thanks @ion-elgreco. Were you able to create a view? I do not see an option to provide a view definition query.