PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Retrieving the output of runs with multiple tasks is not supported (jobs_runs_submit_by_id) #13127

Open milenkobeslic opened 1 year ago

milenkobeslic commented 1 year ago

When using jobs_runs_submit_by_id_and_wait_for_completion with a Databricks job that has multiple tasks, we get the following error:

Finished in state Failed('Flow run encountered an exception. Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect_databricks/rest.py", line 149, in _unpack_contents
    response.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/httpx/_models.py", line 749, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://{removed}.cloud.databricks.com/api/2.1/jobs/runs/get-output?run_id=4816808'
For more information check: https://httpstatuses.com/400

The above exception was the direct cause of the following exception:

httpx.HTTPStatusError: A job run with multiple tasks was provided.JSON response: {'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Retrieving the output of runs with multiple tasks is not supported. Please retrieve the output of each individual task run instead.'}
')

All the tasks in the job, and the job itself, complete successfully, but the flow run fails at this point.

Expectation / Proposal

The flow should be marked as completed if all tasks in the job complete successfully, and as failed if any task fails.
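As a rough illustration of the proposed behavior (not the actual fix, and not part of prefect-databricks), the sketch below derives the overall outcome from the per-task states of a run and collects the individual task run IDs, which is what the Databricks error message asks callers to use. The helper name `summarize_multitask_run` and the sample payload are hypothetical; the field names (`tasks`, `run_id`, `state.result_state`) are assumed to match the Jobs API 2.1 `runs/get` response.

```python
from typing import Any, Dict, List, Tuple


def summarize_multitask_run(run: Dict[str, Any]) -> Tuple[bool, List[int]]:
    """Return (all_tasks_succeeded, task_run_ids) for a runs/get-style payload.

    Hypothetical helper: assumes each entry in run["tasks"] carries its own
    "run_id" and a "state" dict with a "result_state" field.
    """
    tasks = run.get("tasks", [])
    task_run_ids = [task["run_id"] for task in tasks]
    # Succeed only if there is at least one task and every task succeeded.
    all_succeeded = bool(tasks) and all(
        task.get("state", {}).get("result_state") == "SUCCESS" for task in tasks
    )
    return all_succeeded, task_run_ids


# Illustrative payload only; the task IDs and task keys are made up.
example_run = {
    "run_id": 4816808,
    "tasks": [
        {"run_id": 101, "task_key": "ingest",
         "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"}},
        {"run_id": 102, "task_key": "transform",
         "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"}},
    ],
}

ok, task_ids = summarize_multitask_run(example_run)
```

Each ID in `task_ids` could then be passed to `runs/get-output` on its own, instead of the parent run ID that triggers the 400 response above.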

zerodarkzone commented 1 year ago

This is fixed by https://github.com/PrefectHQ/prefect-databricks/pull/75