First check
[X] I used the GitHub search to find a similar issue and didn't find it.
[X] I searched the Prefect documentation for this issue.
[X] I checked that this issue is related to Prefect and not one of its dependencies.
Bug summary
We have the number of retries configurable per task, but pass a set of common kwargs to all tasks, including an exponential backoff for retries. When a task that pulls data from an endpoint started failing (the endpoint had added a more stringent rate limit), we decreased its retries to 0 and then noticed many 500 Internal Server Errors.
It seems that, unlike passing a list of values to retry_delay_seconds (which works fine with retries=0), the exponential_backoff function causes problems under the hood when retries=0.
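For reference, the shared configuration looks roughly like this (a minimal sketch; the names and values are illustrative, not our actual code):

from prefect import task
from prefect.tasks import exponential_backoff

# Illustrative sketch of the pattern we use; the real config has more settings.
COMMON_RETRY_KWARGS = dict(
    retry_delay_seconds=exponential_backoff(backoff_factor=10),
)

# Per-task retry counts are configurable; this one was lowered to 0
# once the endpoint started rate limiting us.
@task(retries=0, **COMMON_RETRY_KWARGS)
def pull_data_from_endpoint():
    ...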
Reproduction
from prefect import flow, task
from prefect.tasks import exponential_backoff

@task(retries=0, retry_delay_seconds=exponential_backoff(backoff_factor=1))
def a_task():
    raise ValueError()

@flow()
def a_flow():
    a_task()

if __name__ == "__main__":
    a_flow()
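For comparison, the list form mentioned above behaves as expected with retries=0: the task still fails with the ValueError, but no 500 comes back from the API (the delay values here are just illustrative):

from prefect import task

@task(retries=0, retry_delay_seconds=[1, 2, 4])
def a_task_with_list():
    raise ValueError()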
Error
File "/home/sam/arenko/flows/tmp2.py", line 11, in a_flow
a_task()
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/tasks.py", line 689, in __call__
return enter_task_run_engine(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/engine.py", line 1421, in enter_task_run_engine
return from_sync.wait_for_call_in_loop_thread(begin_run)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/api.py", line 218, in wait_for_call_in_loop_thread
return call.result()
^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 318, in result
return self.future.result(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 179, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/sam/.pyenv/versions/3.11.4/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
result = await coro
^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/engine.py", line 1555, in get_task_call_return_value
return await future._result()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/futures.py", line 237, in _result
return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/states.py", line 91, in _get_state_result
raise await get_state_exception(state)
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/task_runners.py", line 231, in submit
result = await call()
^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/engine.py", line 1806, in begin_task_run
state = await orchestrate_task_run(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/engine.py", line 2149, in orchestrate_task_run
state = await propose_state(client, terminal_state, task_run_id=task_run.id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/utilities/engine.py", line 381, in propose_state
response = await set_state_and_handle_waits(set_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/utilities/engine.py", line 368, in set_state_and_handle_waits
response = await set_state_func()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/client/orchestration.py", line 2332, in set_task_run_state
response = await self._client.post(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1892, in post
return await self.request(
^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1574, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/client/base.py", line 358, in send
response.raise_for_status()
File "/home/sam/arenko/flows/.venv/lib/python3.11/site-packages/prefect/client/base.py", line 171, in raise_for_status
raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Server error '500 Internal Server Error' for url 'http://ephemeral-prefect/api/task_runs/03b9ffc8-eabd-4fba-89eb-18d19f4dad5e/set_state'
Response: {'exception_message': 'Internal Server Error'}
Versions
Additional context
No response