PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
15.85k stars 1.55k forks

Context documentation incorrectly has ‘import prefect.context’ which is invalid #4977

Closed marvin-robot closed 2 years ago

marvin-robot commented 3 years ago

Opened from the Prefect Public Slack Community

kpweiler: Hi there - I have a flow where I would like to (at flow run time) get a date (as a string) from a function defined in my codebase and provide it to ALL of the tasks in the flow. This sounds like a Parameter, but there are a couple problems with this:

  1. a function call is not JSON serializable
  2. I don’t really want every single task to depend on one node that is just getting a string - the graph view would be extremely cluttered.

Any ideas on how to accomplish this?

kevin701: Hey @kpweiler, you can:

  1. Create a task to return the date. The parameter can be something like "today" or "days_from_today=0", then the task will calculate it and return it.
  2. Maybe pull it somehow from the prefect context
  3. Store your flow as a script and calculate it at the top (like after imports). Use the variable in your tasks directly (I think this will work). Docs on pickle vs script-based storage: https://docs.prefect.io/orchestration/flow_config/storage.html#pickle-vs-script-based-storage
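Option 1 above could be sketched roughly as follows. This is plain Python rather than Prefect code: `resolve_run_date` and its `today` argument are hypothetical names introduced for illustration, and in a real flow the function body would presumably live inside a task that receives `days_from_today` as a Parameter.

```python
from datetime import date, timedelta
from typing import Optional


def resolve_run_date(days_from_today: int = 0, today: Optional[date] = None) -> str:
    """Return the run date as an ISO string, offset from `today` by whole days.

    `today` is a hypothetical override that makes the helper testable;
    when it is omitted, the current local date is used.
    """
    base = today if today is not None else date.today()
    return (base + timedelta(days=days_from_today)).isoformat()


# The integer offset is JSON serializable, unlike a function call.
example = resolve_run_date(days_from_today=-1, today=date(2021, 3, 1))
```

As kpweiler notes in the next message, every task that consumes the result still gets an edge to this one node, which is the graph clutter they were hoping to avoid.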

kpweiler: thanks @kevin701 - if I do #1 - every task that needs this date will have a graph connection to the parameter and the downstream task right?

kevin701: Context has `prefect.context.get("today")`. See https://docs.prefect.io/api/latest/utilities/context.html.

kevin701: Yep that’s right

kpweiler: ok - I think i’ll give the script storage a try - I’m using Docker Storage so it looks like that is supported

kpweiler: that’s gonna need a big refactor I guess though because I’m currently using the imperative API and some builder functions to build the flow. I guess I could generate the script with a Jinja Template

kevin701: Maybe pulling it from context is the easiest then?

kpweiler: yeah - having a look

The story here is that we have a flow we want to run once per day, but it might take longer than a day to complete. I could tell each individual task to ask what day it is - but if it goes past 24 hours - it’s going to run flow steps with the wrong date

kevin701: The context won’t change once the flow is running, but there is also scheduled_start_time

kpweiler: Does this context time return in the system’s time zone or UTC?

kevin701: Ah good question. It might be UTC. Let me check

kevin701: It is UTC

kpweiler: ok - I actually think it’s a moot point because we’ll be running at 4PM Chicago time - which is variously 21 or 22 UTC, but the same date
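The "same date" reasoning can be checked with a small sketch. It uses fixed UTC offsets (-5 for daylight time, -6 for standard time) as stand-ins for Chicago's time zone, and the specific dates and the helper name `utc_date_matches_local` are chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for Chicago time (assumption: no tz database needed here).
CDT = timezone(timedelta(hours=-5))  # daylight time
CST = timezone(timedelta(hours=-6))  # standard time


def utc_date_matches_local(local_dt: datetime) -> bool:
    """True when the UTC calendar date equals the local calendar date."""
    return local_dt.astimezone(timezone.utc).date() == local_dt.date()


summer_run = datetime(2021, 7, 1, 16, 0, tzinfo=CDT)   # 4PM local is 21:00 UTC
winter_run = datetime(2021, 1, 15, 16, 0, tzinfo=CST)  # 4PM local is 22:00 UTC
late_run = datetime(2021, 7, 1, 20, 0, tzinfo=CDT)     # 8PM local is 01:00 UTC next day
```

A 4PM run keeps the same date in both seasons, while a run scheduled at 7PM or later local time in summer would already roll over to the next UTC date.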

kpweiler: if you re-run a flow from failed, do you get a new context?

kevin701: all of the fields are the same except the ones around timestamps which change. scheduled_start_time would be the same though

kevin701: This is the list of things that change: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/engine/flow_runner.py#L180-L189

kpweiler: cool yeah, so scheduled_start_time it is
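A minimal sketch of deriving the date string from `scheduled_start_time`, assuming the value behaves like a timezone-aware datetime. A plain dict stands in for the Prefect context so the snippet runs without Prefect installed; `run_date_from_context` and `fake_context` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Any, Mapping


def run_date_from_context(context: Mapping[str, Any]) -> str:
    """Format the scheduled start time as a YYYY-MM-DD string in UTC."""
    start = context["scheduled_start_time"]
    return start.astimezone(timezone.utc).strftime("%Y-%m-%d")


# Stand-in for the real context (assumption: in a task this would be prefect.context).
fake_context = {
    "scheduled_start_time": datetime(2021, 9, 15, 21, 0, tzinfo=timezone.utc)
}
run_date = run_date_from_context(fake_context)
```

Because the scheduled start time stays the same when a flow is re-run from failed, every task calling a helper like this would see the same date even if the run spans more than 24 hours.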

kpweiler: somewhat separately - is this documentation correct? https://docs.prefect.io/api/latest/utilities/context.html

I can’t seem to import prefect.context

kpweiler: I’m on 0.14.7 but the 0.14.22 documentation looks to be the same

kevin701: You import prefect and then

prefect.context.get("scheduled_start_time")

inside the task

kpweiler: do you do the import in the task itself also?

kevin701: no need. import at the top of the script.

kevin701:

import prefect
from prefect import task, Flow

@task()
def log_stuff(x):
    logger = prefect.context.get("logger")
    logger.info(prefect.context.get("scheduled_start_time"))
    return x

kpweiler: yeah - i think that doc is incorrect - you can’t

import prefect.context

just an FYI - nbd
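One plausible explanation, stated here as an assumption rather than confirmed from the Prefect source, is that `context` is an object attached to the top-level package and not a submodule. The sketch below reproduces that situation with a hypothetical stand-in package named `fakepkg`: attribute access works, but the dotted import fails.

```python
import sys
import types

# Hypothetical stand-in package: it has a `context` attribute but no
# `context` submodule on its (empty) search path.
fakepkg = types.ModuleType("fakepkg")
fakepkg.__path__ = []  # marks the module as a package with nothing to search
fakepkg.context = {"scheduled_start_time": "2021-09-15T21:00:00+00:00"}
sys.modules["fakepkg"] = fakepkg

import fakepkg  # resolves to the registered module

# Attribute access works, mirroring `import prefect` then `prefect.context.get(...)`.
value = fakepkg.context.get("scheduled_start_time")

# The dotted import asks the import system for a submodule, which does not exist.
try:
    import fakepkg.context
    submodule_import_failed = False
except ImportError:
    submodule_import_failed = True
```

The working pattern is the one kevin701 shows above: import the top-level package and reach `context` as an attribute.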

kevin701: Ah crud I see your point.

kevin701: <@ULVA73B9P> open “Context documentation incorrectly has ‘import prefect.context’ which is invalid”

Original thread can be found here.

kvnkho commented 3 years ago

No need to read the above. This is the page with the wrong code snippet: https://docs.prefect.io/api/latest/utilities/context.html

kvnkho commented 2 years ago

The same page says that Prefect Cloud supplies "the following...", making it seem like Server does not.