JSON Job Generated by CLI not accepted for a run only once (submit) #1039

Closed RaccoonForever closed 10 months ago

RaccoonForever commented 10 months ago

workflow.json

Describe the issue

I'm currently trying to execute a one-time (submit) run using the CLI: databricks jobs submit --json XXXX

It gives me the error "Error: One of job_cluster_key, new_cluster, or existing_cluster_id must be specified."

Something to note: if I change the job's cluster to an existing interactive cluster, it works. That's why I suspect the job cluster configuration is the problem.

I thought it might be linked to https://github.com/databricks/cli/issues/992, but everything in spark_conf is already quoted, so that doesn't seem to be the cause.

Edit: added the JSON file

Steps to reproduce the behavior

How did I generate the JSON: ./databricks jobs get JOB_ID -o json | jq .settings > workflow.json

My JSON file:

{
  "email_notifications": {
    "no_alert_for_skipped_runs": false
  },
  "format": "MULTI_TASK",
  "job_clusters": [
    {
      "job_cluster_key": "Job_cluster",
      "new_cluster": {
        "azure_attributes": {
          "availability": "ON_DEMAND_AZURE",
          "first_on_demand": 1,
          "spot_bid_max_price": -1
        },
        "custom_tags": {
          "MonitoringTag": "Tests CICD",
          "ResourceClass": "SingleNode"
        },
        "data_security_mode": "SINGLE_USER",
        "enable_elastic_disk": true,
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 0,
        "runtime_engine": "PHOTON",
        "spark_conf": {
          "spark.databricks.cluster.profile": "singleNode",
          "spark.master": "local[*, 4]"
        },
        "spark_env_vars": {
          "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
        },
        "spark_version": "13.3.x-scala2.12"
      }
    }
  ],
  "max_concurrent_runs": 1,
  "name": "TEST",
  "tasks": [
    {
      "email_notifications": {},
      "job_cluster_key": "Job_cluster",
      "libraries": [
        {
          "whl": "dbfs:/libraries/XXX.whl"
        }
      ],
      "notebook_task": {
        "notebook_path": "/tests/...",
        "source": "WORKSPACE"
      },
      "run_if": "ALL_SUCCESS",
      "task_key": "test",
      "timeout_seconds": 0
    }
  ],
  "timeout_seconds": 0,
  "webhook_notifications": {}
}

Then I execute the following to submit: ./databricks.exe jobs submit --json '@workflow.json'

Expected Behavior

It should launch a one-time run with my configuration.

Actual Behavior

It gives me the error "Error: One of job_cluster_key, new_cluster, or existing_cluster_id must be specified."

OS and CLI version

Windows with Git Bash. Databricks CLI: 0.210.1

Is this a regression?

No idea.

Debug Logs


time=2023-12-04T12:18:32.583+01:00 level=INFO source="root.go:55" msg=start pid=76112 version=0.210.1 args="C:\\Users\\XXXX\\Downloads\\databricks_cli_0.210.1_windows_amd64\\databricks.exe, jobs, submit, --json, @workflow.json, --log-level, debug"
time=2023-12-04T12:18:32.610+01:00 level=DEBUG source="config_file.go:100" msg="Loading DEFAULT profile from C:\\Users\\XXXX/.databrickscfg" pid=76112 sdk=true
time=2023-12-04T12:18:32.853+01:00 level=DEBUG source="api_client.go:219" msg="non-retriable error: One of job_cluster_key, new_cluster, or existing_cluster_id must be specified." pid=76112 sdk=true
time=2023-12-04T12:18:32.854+01:00 level=DEBUG source="api_client.go:324" msg="POST /api/2.1/jobs/runs/submit\n> {\n>   \"email_notifications\": {\n>     \"no_alert_for_skipped_runs\": false\n>   },\n>   \"tasks\": [\n>     {\n>       \"email_notifications\": {},\n>       \"libraries\": [\n>         {\n>           \"whl\": \"dbfs:/libraries/XXXX.whl\"\n>         }\n>       ],\n>       \"notebook_task\": {\n>         \"notebook_path\": \"/tests/... (11 more bytes)\",\n>         \"source\": \"WORKSPACE\"\n>       },\n>       \"task_key\": \"test\",\n>       \"timeout_seconds\": 0\n>     }\n>   ],\n>   \"timeout_seconds\": 0,\n>   \"webhook_notifications\": {}\n> }\n< HTTP/2.0 400 Bad Request\n< {\n<   \"error_code\": \"INVALID_PARAMETER_VALUE\",\n<   \"message\": \"One of job_cluster_key, new_cluster, or existing_cluster_id must be specified.\"\n< }" pid=76112 sdk=true
Error: One of job_cluster_key, new_cluster, or existing_cluster_id must be specified.
time=2023-12-04T12:18:32.855+01:00 level=ERROR source="root.go:114" msg="failed execution" pid=76112 exit_code=1 error="One of job_cluster_key, new_cluster, or existing_cluster_id must be specified."
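
Note that in the POST body logged above, the top-level job_clusters array and the task-level job_cluster_key from my JSON were dropped before the request was sent, which matches the error returned by the server.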
andrewnester commented 10 months ago

@RaccoonForever The API payload for the jobs submit call is different from the one returned for a job. You can see the expected payload here: https://docs.databricks.com/api/workspace/jobs/submit. In particular, it does not support job_clusters in the payload; the cluster configuration should be included within each task definition.

If instead you want to trigger a run of an existing job, you can use the databricks jobs run-now command.
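
For reference, here is a minimal, untested sketch of that reshaping with jq, in the spirit of the command used to generate workflow.json above. It assumes a single job cluster shared by all tasks; the output name submit.json is made up, and which job-only top-level fields (format, max_concurrent_runs) must be dropped for runs/submit is an assumption, so verify against the linked docs:

# Hypothetical reshaping: inline the job cluster into each task,
# drop job-only fields, and rename name to run_name for the run.
jq '(.job_clusters[0].new_cluster) as $nc
    | del(.job_clusters, .format, .max_concurrent_runs)
    | .run_name = .name | del(.name)
    | .tasks |= map((. + {new_cluster: $nc}) | del(.job_cluster_key))' \
  workflow.json > submit.json
./databricks jobs submit --json '@submit.json'

After this reshaping, each task carries its own new_cluster and there is no top-level job_clusters, which matches the shape the Submit endpoint documents.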