databrickslabs / dbx

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
https://dbx.readthedocs.io

Deployment of sql_task (file) is not working #860

Open antgei opened 5 months ago

antgei commented 5 months ago

Expected Behavior

Current Behavior

Problem 1

ValidationError: 1 validation error for Deployment workflows -> 0 -> Workflow -> tasks -> 2 -> sql_task -> file -> file field required (type=value_error.missing)

This contradicts the current Jobs API, which expects sql_task -> file -> path rather than sql_task -> file -> file: https://docs.databricks.com/api/workspace/jobs/create
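To make the mismatch concrete, here is a minimal sketch in plain Python. It does not use dbx's real pydantic models; the two required-field sets, the helper `missing_fields`, and the placeholder path are assumptions made for illustration only.

```python
# Hypothetical stand-ins for the two schemas; these are NOT dbx's real models.
DBX_SQL_FILE_REQUIRED = {"file"}  # field the dbx validation error asks for
API_SQL_FILE_REQUIRED = {"path"}  # field the Jobs API payload in this issue uses


def missing_fields(block: dict, required: set) -> list:
    """Return the required keys that are absent from block, sorted for stable output."""
    return sorted(required - set(block))


# Shape of sql_task -> file as written in the deployment file (path is a placeholder).
sql_file = {"path": "<workspace-path>/create_view.sql", "source": "WORKSPACE"}

print(missing_fields(sql_file, DBX_SQL_FILE_REQUIRED))  # what dbx complains about
print(missing_fields(sql_file, API_SQL_FILE_REQUIRED))  # what the API would complain about
```

Under these assumptions, the same block fails the dbx-style check while satisfying the API-style check, which is the situation the validation error describes.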

Problem 2

The second problem is that "libraries", including the .whl, is appended to every task, which is not allowed for a sql_task:

{
    "sql_task": {
        ...
    },
    "libraries": [
        {
            "whl": "dbfs:/dbx/dwh-databricks-jobs/0d8888c4345e41a99e9c370c60232ee9/artifacts/dist/dwh_core-0.0.1-py3-none-any.whl"
        }
    ]
}

SQL Task does not support dependent libraries. Remove the dependent libraries and retry again.
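The condition needed here can be sketched as a small post-processing step over a job payload. This is only an illustration of the rule "no libraries on SQL tasks", not a drop-in fix for dbx; the helper name `strip_libraries_from_sql_tasks` and the example payload (task keys, package name, wheel path) are hypothetical.

```python
import copy


def strip_libraries_from_sql_tasks(job_settings: dict) -> dict:
    """Return a copy of the job settings with "libraries" removed from every SQL task."""
    cleaned = copy.deepcopy(job_settings)  # leave the caller's dict untouched
    for task in cleaned.get("tasks", []):
        if "sql_task" in task:
            # The Jobs API rejects dependent libraries on SQL tasks (see the error above).
            task.pop("libraries", None)
    return cleaned


# Placeholder payload: task keys, IDs, and wheel paths are illustrative only.
wheel = {"whl": "dbfs:/<placeholder>/example-0.0.1-py3-none-any.whl"}
job = {
    "tasks": [
        {"task_key": "create_view", "sql_task": {"warehouse_id": "<ID>"}, "libraries": [wheel]},
        {"task_key": "load", "python_wheel_task": {"package_name": "example"}, "libraries": [wheel]},
    ]
}
cleaned = strip_libraries_from_sql_tasks(job)
```

Only the task that carries a `sql_task` key loses its `libraries` entry; other task types keep the wheel, and the input dict is not modified.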

Steps to Reproduce (for bugs)

dbx deploy ...

...
{
    "task_key": "create_view",
    "sql_task": {
        "file": {
            "path": "/Repos/.......create_view.sql",
            "source": "WORKSPACE"
        },
        "warehouse_id": "<ID>"
    }
}

Context

Your Environment

antgei commented 5 months ago

Suggested patch to make it work:

diff --git a/dbx/api/adjuster/adjuster.py b/dbx/api/adjuster/adjuster.py
index 4ae4301..8ad5f2c 100644
--- a/dbx/api/adjuster/adjuster.py
+++ b/dbx/api/adjuster/adjuster.py
@@ -85,7 +85,8 @@ class PropertyAdjuster(
         for element, _, __ in self.traverse(workflows):

             if isinstance(element, (V2dot0Workflow, JobTaskSettings)):
-                self._preprocess_libraries(element, additional_libraries)
+                if element.sql_task is None:  # SQL Task does not support dependent libraries
+                    self._preprocess_libraries(element, additional_libraries)

     def _new_cluster_handler(self, element: NewCluster):
         # driver_instance_pool_name -> driver_instance_pool_id
diff --git a/dbx/models/workflow/v2dot1/task.py b/dbx/models/workflow/v2dot1/task.py
index e063bb4..0ba09c8 100644
--- a/dbx/models/workflow/v2dot1/task.py
+++ b/dbx/models/workflow/v2dot1/task.py
@@ -42,7 +42,8 @@ class SqlTaskAlert(FlexibleModel):

 class SqlFile(FlexibleModel):
-    file: str
+    path: str
+    source: str

 class SqlTask(FlexibleModel):
jonfoxchase commented 4 months ago

Is there any update on this? I'm facing the same issue.