databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

[ISSUE] "temporarily unavailable" errors are not retried #659

Open jan-kouba opened 1 month ago

jan-kouba commented 1 month ago

Description

I got this error about 2 seconds after I tried to run a job:

InternalError: The service at /api/2.1/jobs/runs/get?run_id=<id> is temporarily unavailable. Please try again later. [TraceId: -]

Reproduction

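import datetime
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
job_timeout = datetime.timedelta(hours=3.0)
# job_id is the ID of an existing job in the workspace
result = w.jobs.run_now(job_id).result(timeout=job_timeout)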

Expected behavior

When the API server returns a "temporarily unavailable" error, the SDK should retry the request, as described in the README (retry_timeout_seconds), instead of failing quickly (after 2 seconds in my case).
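To illustrate the behavior I would expect, here is a minimal client-side sketch. It is not the SDK's own retry logic. The helper name call_with_retry and all of its parameters are hypothetical, and matching on the "temporarily unavailable" text in the message is an assumption based only on the error shown above.

```python
import time


def call_with_retry(fn, timeout_seconds=300.0, initial_delay=1.0, max_delay=10.0,
                    is_retryable=None, sleep=time.sleep, clock=time.monotonic):
    """Call fn() until it succeeds, hits a non-retryable error, or runs out of time.

    Hypothetical helper for illustration; not part of databricks-sdk-py.
    """
    if is_retryable is None:
        # Assumption: transient failures can be recognized by this message text.
        def is_retryable(err):
            return "temporarily unavailable" in str(err).lower()

    deadline = clock() + timeout_seconds
    delay = initial_delay
    while True:
        try:
            return fn()
        except Exception as err:
            # Re-raise if the error is not transient or the next wait would pass the deadline.
            if not is_retryable(err) or clock() + delay > deadline:
                raise
            sleep(delay)
            # Exponential backoff, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

With a helper like this, any callable that performs an SDK request could be wrapped, for example call_with_retry(lambda: some_sdk_call()), where some_sdk_call is a placeholder. Ideally the SDK would handle this itself so callers do not need such a wrapper.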

Is it a regression?

I don't know.

Debug Logs

I can't reproduce this because I cannot make the Workspace API return this particular error, so I have no debug logs.

Other Information

Additional context