databricks / databricks-cli

(Legacy) Command Line Interface for Databricks

Create Job on Windows results in MALFORMED_REQUEST #646

Closed PhilippLange closed 6 months ago

PhilippLange commented 1 year ago

Hi,

While trying to create a (very simple, stripped-down) Databricks Job using the Databricks CLI on Windows:

databricks jobs create --json '{"name":"MY_JOB","email_notifications":{"no_alert_for_skipped_runs":false},"webhook_notifications":{},"timeout_seconds":0,"max_concurrent_runs":1,"tasks":[{"task_key":"MY_TASK","notebook_task":{"notebook_path":"MY_NOTEBOOK","source":"WORKSPACE"},"job_cluster_key":"Job_cluster","timeout_seconds":0,"email_notifications":{}}],"job_clusters":[{"job_cluster_key":"Job_cluster","new_cluster":{"spark_version":"12.2.x-scala2.12","spark_conf":{"spark.databricks.delta.preview.enabled":"true"},"azure_attributes":{"first_on_demand":1,"availability":"ON_DEMAND_AZURE","spot_bid_max_price":-1},"node_type_id":"Standard_DS3_v2","spark_env_vars":{"PYSPARK_PYTHON":"/databricks/python3/bin/python3"},"enable_elastic_disk":true,"data_security_mode":"LEGACY_SINGLE_USER_STANDARD","runtime_engine":"STANDARD","num_workers":8}}],"format":"MULTI_TASK"}'

I found that the command fails with: Error: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The same command works fine from a Linux shell.

I'm running Python 3.11.3 and databricks-cli==0.17.7.

AkshataHegde-ZS commented 1 year ago

Hi @PhilippLange, I'm also encountering the same issue on my Windows machine. Have you made any progress toward a solution? I'm aware that databricks jobs create --json-file <configFile>.json can be used instead, but my use case prevents me from using that command as a workaround.

PhilippLange commented 1 year ago

Hi @AkshataHegde-ZS, I was not able to find a solution. BR Philipp

AkshataHegde-ZS commented 1 year ago

A bit of Googling later, I found out that you need to escape the double quotes with a backslash within the JSON string. Under the hood, the Databricks CLI uses Python's json.loads() to parse the --json argument, and the error we're getting is a JSONDecodeError raised by that parser.
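A minimal sketch of why the parser fails, assuming (as described above) that the CLI effectively calls json.loads() on the raw argument string, and that the Windows shell passes the single quotes through literally so the parser sees a string starting with ' instead of {:

```python
import json

# Assumption: on Windows, cmd.exe treats ' as an ordinary character, so the
# argument the CLI receives still begins with a literal single quote.
arg_windows = "'{\"name\":\"MY_JOB\"}'"
try:
    json.loads(arg_windows)
except json.JSONDecodeError as e:
    # A leading ' is not a valid start of a JSON value, hence char 0:
    print(e)  # Expecting value: line 1 column 1 (char 0)

# With the inner double quotes backslash-escaped, the quotes survive shell
# processing and json.loads() receives valid JSON:
arg_escaped = '{"name":"MY_JOB"}'
print(json.loads(arg_escaped))  # {'name': 'MY_JOB'}
```

This is only an illustration of the parsing behavior; the exact string the CLI receives depends on which Windows shell (cmd.exe vs. PowerShell) is in use.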

So by adding \ as an escape character I was able to create the job successfully, e.g.:


databricks jobs create --json '{\"name\":\"MY_JOB\",\"email_notifications\":{\"no_alert_for_skipped_runs\":false},\"webhook_notifications\":{},\"timeout_seconds\":0,\"max_concurrent_runs\":1,\"tasks\":[{\"task_key\":\"MY_TASK\",\"notebook_task\":{\"notebook_path\":\"MY_NOTEBOOK\",\"source\":\"WORKSPACE\"},\"job_cluster_key\":\"Job_cluster\",\"timeout_seconds\":0,\"email_notifications\":{}}],\"job_clusters\":[{\"job_cluster_key\":\"Job_cluster\",\"new_cluster\":{\"spark_version\":\"12.2.x-scala2.12\",\"spark_conf\":{\"spark.databricks.delta.preview.enabled\":\"true\"},\"azure_attributes\":{\"first_on_demand\":1,\"availability\":\"ON_DEMAND_AZURE\",\"spot_bid_max_price\":-1},\"node_type_id\":\"Standard_DS3_v2\",\"spark_env_vars\":{\"PYSPARK_PYTHON\":\"/databricks/python3/bin/python3\"},\"enable_elastic_disk\":true,\"data_security_mode\":\"LEGACY_SINGLE_USER_STANDARD\",\"runtime_engine\":\"STANDARD\",\"num_workers\":8}}],\"format\":\"MULTI_TASK\"}'