PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

importlib._bootstrap_external>", line 1073, in get_data FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpz0cmn1ddprefect/AOPS_SQL_Workflow_v1.py' #6979

Closed xbabu closed 1 year ago

xbabu commented 2 years ago

First check

Bug summary

When I use the REST API with a schedule enabled (e.g., RRULE), I get the following error in Prefect Orion 2.4. It worked fine in 2.0. Even though I have a dedicated storage directory, a workflow run through the REST API points to the default /tmp folder rather than the one I set with PREFECT_LOCAL_STORAGE_PATH. If I run it using a .yml file, it uses the path specified in PREFECT_LOCAL_STORAGE_PATH and works fine. Please let me know what I am missing in the payload.

Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpz0cmn1ddprefect/AOPS_SQL_Workflow_v1.py'
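The directory in the error, `/tmp/tmpz0cmn1ddprefect`, has the shape that Python's `tempfile` module gives a temporary directory created with the suffix `prefect` under the default temp root. The sketch below only demonstrates that standard-library naming scheme. That Prefect creates its working directory this way is an assumption rather than something confirmed in this thread, and `make_working_dir` is a hypothetical helper name.

```python
import os
import tempfile


def make_working_dir(suffix: str = "prefect") -> tempfile.TemporaryDirectory:
    """Create a temporary directory named tmp<random><suffix>.

    Hypothetical helper: it mirrors the shape of the path in the error
    message, but it is not taken from Prefect's source.
    """
    return tempfile.TemporaryDirectory(suffix=suffix)


with make_working_dir() as workdir:
    parent, name = os.path.split(workdir)
    # The parent comes from tempfile.gettempdir(), which consults the
    # TMPDIR, TEMP and TMP environment variables before falling back to
    # platform defaults such as /tmp.
    print("temp root:", tempfile.gettempdir())
    print("working dir:", workdir)
    # A relative script name would only be found here if the file had
    # been copied into this directory beforehand.
    script = os.path.join(workdir, "AOPS_SQL_Workflow_v1.py")
    print("script present:", os.path.exists(script))
```

Running this in the same environment as the agent shows which temp root that process actually resolves, which may help explain why the path differs from the configured storage directory.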

Payload used:

2022-09-24 16:40:54.170 DEBUG 18568 --- [http-nio-5080-exec-962] com.aops.waves.service.FlowService : jsonObject.toJSONString():
{
  "parameter_openapi_schema": {
    "type": "object",
    "title": "Parameters",
    "properties": {
      "kwargs": "{\"type\": \"string\", \"title\":\"kwargs\"}"
    },
    "required": [ "kwargs" ]
  },
  "infrastructure_document_id": "736c6e6f-6c03-40fa-8ea4-71f946a05343",
  "infra_overrides": {},
  "description": "AOPS_SQL_Workflow_DS_12",
  "version": "1",
  "work_queue_name": "waves_q",
  "tags": [ "waves_q" ],
  "path": "/localpart0/aop-shared/WAVES/workflows/",
  "schedule": {
    "rrule": "DTSTART:20220924T124300\nRRULE:FREQ=HOURLY;INTERVAL=1;COUNT=1;UNTIL=20220924T124500",
    "timezone": "US/Eastern"
  },
  "flow_id": "d7202f6b-b929-4139-9e4d-73e2636a3fe0",
  "entrypoint": "AOPS_SQL_Workflow_v1.py:AOPS_SQL_Workflow",
  "name": "AOPS_SQL_Workflow_DS_12",
  "parameters": {
    "kwargs": {
      "sqltype": 1,
      "date_range": "CURRENT_DATE - INTERVAL '1 months'",
      "dbname": "gpprod",
      "selection": "count()",
      "flow_name": "AOPS_SQL_Workflow",
      "rpt_flag": "1",
      "tab1": "whse.dim_company",
      "schd_run_name": "AOPS_SQL_Workflow_DS_12",
      "sql": "SELECT {selection} FROM {tab1} WHERE show_in_report_flag = {rpt_flag} and (creation_date > ({date_range}));"
    }
  },

API Response:

{
  "id": "ba109826-6846-4e0e-815b-ee905c593dab",
  "created": "2022-09-23T19:49:43.563714+00:00",
  "updated": "2022-09-24T16:40:54.208488+00:00",
  "name": "AOPS_SQL_Workflow_DS_12",
  "version": "1",
  "description": "AOPS_SQL_Workflow_DS_12",
  "flow_id": "d7202f6b-b929-4139-9e4d-73e2636a3fe0",
  "schedule": {
    "rrule": "DTSTART:20220924T124300\nRRULE:FREQ=HOURLY;INTERVAL=1;COUNT=1;UNTIL=20220924T124500",
    "timezone": "US/Eastern"
  },
  "is_schedule_active": true,
  "infra_overrides": {},
  "parameters": {
    "kwargs": {
      "sql": "SELECT {selection} FROM {tab1} WHERE show_in_report_flag = {rpt_flag} and (creation_date > ({date_range}));",
      "tab1": "whse.dim_company",
      "dbname": "gpprod",
      "sqltype": 1,
      "rpt_flag": "1",
      "flow_name": "AOPS_SQL_Workflow",
      "selection": "count()",
      "date_range": "CURRENT_DATE - INTERVAL '1 months'",
      "schd_run_name": "AOPS_SQL_Workflow_DS_12"
    }
  },
  "tags": [ "waves_q" ],
  "work_queue_name": "waves_q",
  "parameter_openapi_schema": {
    "type": "object",
    "title": "Parameters",
    "required": [ "kwargs" ],
    "properties": {
      "kwargs": "{\"type\": \"string\", \"title\":\"kwargs\"}"
    }
  },
  "path": "/localpart0/aop-shared/WAVES/workflows/",
  "entrypoint": "AOPS_SQL_Workflow_v1.py:AOPS_SQL_Workflow",
  "manifest_path": null,
  "storage_document_id": null,
  "infrastructure_document_id": "736c6e6f-6c03-40fa-8ea4-71f946a05343"
}
"is_schedule_active": "true" }

If we run a workflow every minute for 5 minutes, it fails sporadically with the following error:

Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp7v3me6ytprefect/AOPS_SQL_Workflow_v1.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/engine.py", line 257, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/client/orion.py", line 82, in with_injected_client
    return await fn(*args, **kwargs)
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/deployments.py", line 70, in load_flow_from_flow_run
    flow = await run_sync_in_worker_thread(import_object, str(import_path))
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 57, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/utilities/importtools.py", line 193, in import_object
    module = load_script_as_module(script_path)
  File "/localpart0/aop-shared/WAVES/prefect2/lib/python3.10/site-packages/prefect/utilities/importtools.py", line 156, in load_script_as_module
    raise ScriptError(user_exc=exc, path=path) from exc
prefect.exceptions.ScriptError: Script at 'AOPS_SQL_Workflow_v1.py' encountered an exception

/localpart0/aop-shared/WAVES/prefect2/bin/python3 -m prefect.engine in /localpart0/aop-shared/WAVES/prefect2/tmp/tmpt45q0mvmprefect
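Since the failure is a `FileNotFoundError` for the script named in `entrypoint`, one quick sanity check is to confirm that the script exists under the deployment's `path` on the machine where the agent runs. The following is a minimal sketch using only the `path` and `entrypoint` values from the payload above. `resolve_entrypoint` is a hypothetical helper, and joining the two fields this way is an assumption about how they relate, not Prefect's own loading logic.

```python
from pathlib import Path


def resolve_entrypoint(path: str, entrypoint: str) -> tuple[Path, str]:
    """Split 'script.py:flow_name' and join the script onto the deployment path.

    Hypothetical helper for illustration; not part of Prefect.
    """
    script, _, flow_name = entrypoint.rpartition(":")
    if not script or not flow_name:
        raise ValueError(f"expected 'file.py:flow_name', got {entrypoint!r}")
    return Path(path) / script, flow_name


# Values copied from the payload in this issue.
script_path, flow_name = resolve_entrypoint(
    "/localpart0/aop-shared/WAVES/workflows/",
    "AOPS_SQL_Workflow_v1.py:AOPS_SQL_Workflow",
)
print("script:", script_path)
print("flow function:", flow_name)
print("exists on this machine:", script_path.is_file())
```

If the file exists at that location but the run still looks for it under a temporary directory, the problem is more likely in how the deployment's storage is resolved than in the `path` value itself.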

Reproduction

{}

Error



Versions



Prefect Orion 2.4.2

Additional context

No response

xbabu commented 2 years ago

Please let me know if there is any workaround. This bug has brought our entire system down.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. To keep this issue open remove stale label or comment.

fabrice-toussaint commented 1 year ago

@xbabu how did you fix this? We are currently getting the same error.