databricks / cli

Databricks CLI
Other
148 stars 56 forks source link

Add JobTaskClusterSpec validate mutator #1784

Closed kanterov closed 1 month ago

kanterov commented 1 month ago

Changes

Add JobTaskClusterSpec validate mutator. It catches the case when tasks don't which cluster to use.

For example, we can get this error with minor modifications to default-python template:

      tasks:
        - task_key: python_file_task
          spark_python_task:
            python_file: ../src/my_project_10/main.py
 % databricks bundle validate
Error: Missing required cluster or environment settings
  at resources.jobs.my_project_10_job.tasks[0]
  in resources/my_project_10_job.yml:17:11

Task "print_github_stars" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.

We implicitly rely on "one of" validation, which does not exist. Many bundle fields can't co-exist, for instance, specifying: JobTask.{existing_cluster_id,job_cluster_key}, Library.{whl,pypi}, JobTask.{notebook_task,python_wheel_task}, etc.

Tests

Unit tests

kanterov commented 1 month ago

@pietern I've updated error/warning messages and added more comments as you suggested. I assume that the convert.ToTyped comment is optional because, at this point, the module is small and logic is encapsulated. I think it makes a lot of sense to have shared visitors, for instance, extract code that visits all tasks that need cluster/environment.

pietern commented 1 month ago

Yes, it's not strictly necessary.

A higher-level comment about the error messages is that the summary is what users will read first, and that's what should contain what is wrong, not the solution. Right now they are swapped. The solution is in the summary and the error is in the detail. Could you swap those around? Then it will say, Error: task "main_task" has a task type that requires a cluster or environment, but neither is specified with the solution following it. WDYT?

kanterov commented 1 month ago

@pietern Thanks. I've updated the doc. Can you please add it to the merge queue?