PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Encountered error while running prefect.deployments.steps.set_working_directory - FileNotFoundError #10285

Open tekumara opened 1 year ago

tekumara commented 1 year ago

Bug summary

When the flow runs, Prefect tries to change the working directory to the host directory that was used when creating the deployment, and fails because that directory does not exist where the flow runs.

Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 124, in run_steps
    step_output = await run_step(step, upstream_outputs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 95, in run_step
    result = await from_async.call_soon_in_new_thread(
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 292, in aresult
    return await asyncio.wrap_future(self.future)
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 316, in _run_sync
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/pull.py", line 28, in set_working_directory
    os.chdir(directory)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/tekumara/code/prefect-demo'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 395, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/utilities.py", line 51, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/deployments.py", line 203, in load_flow_from_flow_run
    output = await run_steps(deployment.pull_steps)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 152, in run_steps
    raise StepExecutionError(f"Encountered error while running {fqn}") from exc
prefect.deployments.steps.core.StepExecutionError: Encountered error while running prefect.deployments.steps.set_working_directory

My prefect.yaml does not include any prefect.deployments.steps.set_working_directory step:

name: prefect-demo
prefect-version: 2.11.0

# build section allows you to manage and build docker images
build: null

# the deployments section allows you to provide configuration for deploying flows
deployments:
  - name: main
    version: snapshot
    tags: ["prefect-yaml"]
    description: deployment with flow inside docker container
    entrypoint: flows/param_flow.py:param
    parameters:
      i: 1
    work_pool:
      name: kubes-pool
      job_variables:
        image: prefect-registry:5000/flow:latest
        image_pull_policy: Always
        service_account_name: prefect-flows
        finished_job_ttl: 300

I wouldn't expect this to happen implicitly when I haven't defined a step to do so.
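
For illustration, the failing call can be reproduced outside Prefect. The traceback shows `set_working_directory` calling `os.chdir(directory)`; the sketch below uses a hypothetical stand-in that makes the same call, plus a temporary directory that is deleted to simulate a host path that is missing inside the container. The function names here are assumptions for the example, not Prefect APIs.

```python
import os
import tempfile
from typing import Optional


def set_working_directory(directory: str) -> str:
    # Hypothetical stand-in for the step in the traceback: it only calls os.chdir.
    os.chdir(directory)
    return os.getcwd()


def describe_failure(directory: str) -> Optional[str]:
    """Return the error text if changing into `directory` fails, else None."""
    original = os.getcwd()
    try:
        set_working_directory(directory)
    except FileNotFoundError as exc:
        return f"{type(exc).__name__}: {exc.strerror}: {exc.filename!r}"
    finally:
        # Always restore the starting directory so the caller is unaffected.
        os.chdir(original)
    return None


def missing_directory() -> str:
    """Create a directory and delete it, simulating a host-only path."""
    path = tempfile.mkdtemp()
    os.rmdir(path)
    return path


if __name__ == "__main__":
    print(describe_failure(missing_directory()))
```

Any path recorded on the deploying machine behaves the same way when it is absent from the image, which matches the `/Users/tekumara/code/prefect-demo` path in the error above.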

Reproduction

See above

Error

No response

Versions

❯ prefect version
Version:             2.11.0
API version:         0.8.4
Python version:      3.10.12
Git commit:          eeb9e219
Built:               Thu, Jul 20, 2023 4:34 PM
OS/Arch:             darwin/arm64
Profile:             default
Server type:         server

Additional context

No response

tekumara commented 1 year ago

As a workaround, I have to explicitly tell Prefect to set the working directory to that of my Docker image:

    pull:
    # required see https://github.com/PrefectHQ/prefect/issues/10285
    - prefect.deployments.steps.set_working_directory:
        directory: /opt/prefect

Could Prefect just use the Docker image's default WORKDIR instead?
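
A minimal sketch of the behavior being asked for, assuming the flow process starts in the image's WORKDIR: use the recorded directory only when it exists, and otherwise stay in the current directory. `resolve_working_directory` is a hypothetical helper written for this example; it is not how Prefect implements the step.

```python
from pathlib import Path


def resolve_working_directory(recorded: str) -> Path:
    """Return `recorded` if it is an existing directory, else the current directory."""
    candidate = Path(recorded)
    if candidate.is_dir():
        return candidate
    # Assumption: inside a container the process starts in the image's WORKDIR,
    # so the current directory is the natural fallback.
    return Path.cwd()


if __name__ == "__main__":
    # A host-only path falls back to wherever the process was started.
    print(resolve_working_directory("/Users/tekumara/code/prefect-demo"))
```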

martincpt commented 2 months ago

Same for me on Apple silicon. I wonder if it's an OS- or hardware-specific issue.

martincpt commented 2 months ago

Even the Prefect staff encounter this issue. You can watch this video for reference: https://youtu.be/JKdN9bWIaSw?t=1019

It seems the root cause of this issue is the absence of properly set job variables. Specifically, the image variable is missing, which results in pulling the wrong image.

I was able to resolve this by setting the image variable, as shown in the video. If you're not using Docker Hub like me, you'll need to set your image to image_name:tag, where these values come from your prefect_docker.deployments.steps.build_docker_image settings.
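
As a rough sketch of the `image_name:tag` composition described above, the snippet below builds the image reference from a dictionary of build-step settings. The dictionary shape and the `image_reference` helper are assumptions for illustration; the example values are taken from the `prefect.yaml` earlier in this thread.

```python
def image_reference(build_settings: dict) -> str:
    """Join the build step's image name and tag into a single image reference."""
    # Both keys are required here; no default tag is assumed.
    name = build_settings["image_name"]
    tag = build_settings["tag"]
    return f"{name}:{tag}"


if __name__ == "__main__":
    settings = {"image_name": "prefect-registry:5000/flow", "tag": "latest"}
    # The result is the value to use for the `image` job variable.
    print(image_reference(settings))
```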