Open williamjamir opened 4 months ago
hi @williamjamir - this is actually the intended behavior of quote, i.e. it skips all introspection of the incoming value, which includes resolving futures.
what's your use case for quote here? perhaps you want something like: task3(quote(items[0].result()))
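To make the behavior described above concrete, here is a minimal toy model using only the standard library. It is not Prefect's implementation: `Quote` and `resolve` are hypothetical stand-ins for the quote annotation and for the parameter introspection step, and `concurrent.futures.Future` stands in for `PrefectFuture`. It only sketches why a quoted list of futures reaches the task unresolved.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any


class Quote:
    """Hypothetical stand-in for an annotation meaning 'do not inspect this value'."""

    def __init__(self, value: Any) -> None:
        self.value = value


def resolve(param: Any) -> Any:
    """Toy parameter preparation: walk the value and swap futures for their results."""
    if isinstance(param, Quote):
        # Quoted values are unwrapped and passed through untouched,
        # so any futures inside them are NOT resolved.
        return param.value
    if isinstance(param, Future):
        return param.result()
    if isinstance(param, list):
        return [resolve(item) for item in param]
    return param


def upper(text: str) -> str:
    return text.upper()


with ThreadPoolExecutor() as pool:
    futures = [pool.submit(upper, name) for name in ["acme", "bar"]]
    resolved = resolve(futures)       # futures replaced by their results
    quoted = resolve(Quote(futures))  # futures handed over as-is

print(resolved)
print(all(isinstance(item, Future) for item in quoted))
```

In this sketch the only way to get plain values through `Quote` is to resolve them first, which mirrors the `quote(items[0].result())` suggestion.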
Thanks for the quick reply @zzstoatzz !
I just wanted to clarify that my primary use case is actually task 2. I only did task 3 to demonstrate that the behavior was consistent (using the output of one map as the input for another map).
Here's my specific use case: I'm extracting multiple dataframes from API calls and then passing these dataframes through my filter/business logic before uploading them.
@flow
def myflow():
    # Extracting
    dfs = extract_dfs.map(my_params_and_inputs)
    # Filters from my flow
    cleaned_data = filters_and_businnes_logic.map(dfs)
    # Rest of my flow
> this is actually the intended behavior of quote, i.e. it skips all introspection of the incoming value, which includes resolving futures.
Just as a suggestion: it's not clear from the warning raised by Prefect (or from the docs) that futures are not resolved when using quote; only the introspection aspect is mentioned.
Task parameter introspection took 190.735 seconds , exceeding
`PREFECT_TASK_INTROSPECTION_WARN_THRESHOLD` of 10.0.
Try wrapping large task parameters with `prefect.utilities.annotations.quote`
for increased performance, e.g. `my_task(quote(param))`.
To disable this message set `PREFECT_TASK_INTROSPECTION_WARN_THRESHOLD=0`
In my view, the warning message leads us to think that the most "obvious" solution would be to use quote, like the example below:
@flow
def myflow():
    # Extracting
    dfs = extract_dfs.map(my_params_and_inputs)
    # Filters from my flow
    cleaned_data = filters_and_businnes_logic.map(quote(dfs))
Which, in this case, doesn't work =/
Would the team consider moving this logic into the quote implementation to handle such scenarios? I can clearly see how the quote API gets misused here and, even worse, how the unresolved future propagates to other tasks, like this:
from prefect import flow, task
from prefect.utilities.annotations import quote

@task()
def task1(value: str) -> str:
    print('Inside task1 with value: ', value)
    return f'{value}-processed'

@task()
def task2(value: str) -> str:
    print('Inside task2 with value: ', value)
    return f'{value}-final'

@task()
def task3(value: str) -> None:
    print('Inside task3 with value: ', value)

@flow(flow_run_name='Example flow')
def run_flow():
    items = task1.map(quote(['acme', 'bar']))
    # quote() here means task2 receives the PrefectFuture objects themselves
    items_2 = task2.map(quote(items))
    task3.map(items_2)
>>> run_flow()
Inside task1 with value: acme
Inside task2 with value: PrefectFuture('task1-0')
Inside task2 with value: PrefectFuture('task1-1')
Inside task3 with value: PrefectFuture('task1-0')-final
Inside task1 with value: bar
Inside task3 with value: PrefectFuture('task1-1')-final
Okay, so to confirm, to handle this case correctly I should call result before passing it to the quote?
@flow
def myflow():
    # Extracting
    dfs = extract_dfs.map(my_params_and_inputs)
    # Filters from my flow
    cleaned_data = filters_and_businnes_logic.map(quote([i.result() for i in dfs]))
Or should it be the job of the task filters_and_businnes_logic to check for PrefectFuture and call .result?
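from prefect import task
from prefect.futures import PrefectFuture

@task
def filters_and_businnes_logic(df):
    # Resolve the future inside the task if one was passed in
    if isinstance(df, PrefectFuture):
        df = df.result()
    # ... implementation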
> Okay, so to confirm, to handle this case correctly I should call result before passing it to the quote?
> Or should it be the job of the task filters_and_businnes_logic to check for PrefectFuture and call .result?
I would say whichever way makes the most sense to you works fine. I don't think there's a strong practical difference between resolving the futures outside or inside the task (I might lean toward the former, i.e. outside, to keep tasks portable as pure Python).
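As a rough illustration of the two options discussed here, the sketch below uses only the standard library, with `concurrent.futures.Future` standing in for `PrefectFuture`. The function names `extract`, `filter_rows`, and `filter_rows_defensive` are hypothetical and the data is made up; the point is only to contrast resolving at the call site with resolving inside the function.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def extract(name: str) -> list[int]:
    """Hypothetical extraction step returning a few rows."""
    return [len(name), -1, 3]


def filter_rows(rows: list[int]) -> list[int]:
    """Plain function: it knows nothing about futures."""
    return [row for row in rows if row >= 0]


def filter_rows_defensive(rows):
    """Alternative: the function itself unwraps a future when given one."""
    if isinstance(rows, Future):
        rows = rows.result()
    return [row for row in rows if row >= 0]


with ThreadPoolExecutor() as pool:
    futures = [pool.submit(extract, name) for name in ["acme", "bar"]]
    # Option 1: resolve outside, so the function only ever sees plain values.
    outside = [filter_rows(future.result()) for future in futures]
    # Option 2: pass the futures through and let the function resolve them.
    inside = [filter_rows_defensive(future) for future in futures]

print(outside)
print(inside == outside)
```

Both options produce the same result here. The first keeps `filter_rows` free of any future-handling code, which is the portability argument made above.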
Agreed on the callout around the warning message, we could make it clearer that adding quote would prevent futures from being resolved - can update that.
First check
Bug summary
I've noticed that prefect.utilities.annotations.quote doesn't work as expected when it uses the output from a map.
In the example below, you can see that the value is PrefectFuture rather than the value.
If I replace task1.map with items = [task1.submit(i) for i in ['acme', 'bar']], everything also works fine.
Reproduction
Error
Versions