austinweisgrau closed 1 year ago
@austinweisgrau you'll need to propagate context variables if you're going to use the context across threads you are managing yourself e.g. https://github.com/PrefectHQ/prefect/blob/main/src/prefect/_internal/concurrency/executor.py#L76-L77
@madkinsz Can you demonstrate how that would be accomplished? If I swap out the import in my example with from prefect._internal.concurrency.executor import ThreadPoolExecutor, I still get the same error.
A meta-question here is, is using ThreadPoolExecutor within a prefect task an anti-pattern? Is there a preferred implementation for nested concurrency within a prefect task? I'm confused about how to implement nontrivially structured nested concurrent functions within Prefect; the docs don't seem to discuss it. (slack thread here)
@austinweisgrau you could import the Executor from there; if you import ThreadPoolExecutor from that module, you're just importing the standard library one. Those are internal modules though, so I can't recommend you use them in production. You could accomplish this with your previous example like so:
```python
from concurrent.futures import ThreadPoolExecutor
import contextvars

from prefect import flow, get_run_logger, task


def concurrent_subtask() -> None:
    # Raises MissingContextError unless the Prefect context is propagated
    get_run_logger()


@task
def basic_task():
    get_run_logger().info("This works.")

    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = []
        for _ in range(2):
            # Copy the context per submission: a Context object cannot be
            # entered by two threads at the same time.
            context = contextvars.copy_context()
            future = executor.submit(context.run, concurrent_subtask)
            futures.append(future)
        for future in futures:
            future.result()


@flow
def helloworld() -> None:
    basic_task()


if __name__ == "__main__":
    helloworld()
```
> A meta-question here is, is using ThreadPoolExecutor within a prefect task an anti-pattern?
Basically, yeah. Prefect tasks are intended to be the smallest level of concurrency. We provide task runners to manage concurrent execution of tasks. Using additional concurrency mechanisms within tasks isn't recommended, but it's probably okay if you're just running your tasks sequentially / locally. You just won't have access to the Prefect context while managing concurrency yourself unless you copy it explicitly.
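For anyone who does manage threads inside a task, the explicit copy mentioned above can be wrapped in a small helper. The following is only a sketch using the standard library: `submit_with_context`, `current_run`, and `read_current_run` are hypothetical names, and `current_run` is a plain `ContextVar` standing in for whatever Prefect stores in its run context, not a Prefect API.

```python
import contextvars
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical stand-in for state that a framework keeps in context variables.
current_run = contextvars.ContextVar("current_run")


def submit_with_context(
    executor: ThreadPoolExecutor, fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Future:
    """Submit fn so that it runs inside a snapshot of the caller's context."""
    # Copy at submit time, in the calling thread. Every call gets its own copy
    # because a Context object cannot be entered by two threads at once.
    context = contextvars.copy_context()
    return executor.submit(context.run, fn, *args, **kwargs)


def read_current_run() -> str:
    """Return the value visible in this thread, or a marker if it is unset."""
    return current_run.get("<missing>")


def run_demo() -> list:
    current_run.set("task-run-1")
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [submit_with_context(executor, read_current_run) for _ in range(2)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_demo())
```

Taking a fresh copy per submission matters once more than one worker is involved: `Context.run` raises `RuntimeError` if the same context object is entered from a second thread while it is still active in the first.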
Got it, thanks!
Bug summary
prefect.get_run_logger() raises MissingContextError when called in a method called by concurrent.futures.ThreadPoolExecutor.
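The mechanism behind this can be illustrated without Prefect. This is not the reporter's reproduction, only a minimal sketch that assumes a default CPython build, where a new thread starts with an empty context; `run_context` is a hypothetical `ContextVar` standing in for the run context that get_run_logger() looks up.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the run context; not a Prefect API.
run_context = contextvars.ContextVar("run_context")


def read_context() -> str:
    """Return the value visible in this thread, or a marker if it is unset."""
    return run_context.get("<missing>")


def demo() -> tuple:
    # Set the variable in the calling thread, as a framework would do
    # before invoking user code.
    run_context.set("task-run-1")
    with ThreadPoolExecutor(max_workers=1) as executor:
        # The function runs in a worker thread whose context was never
        # given the value, so the lookup falls back to the default.
        in_worker = executor.submit(read_context).result()
    return read_context(), in_worker


if __name__ == "__main__":
    print(demo())
```

The calling thread still sees its own value; only the lookup inside the worker thread comes back empty, which is the situation in which a framework would raise a missing-context error.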