flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.82k stars, 660 forks

[Core feature] Eager should support executing tasks/workflows by reference without having access to the original function #4673

Open pryce-turner opened 11 months ago

pryce-turner commented 11 months ago

Motivation: Why do you think this is important?

This feature would streamline workflow development by letting a workflow execute locally while its tasks actually run in a Flyte cluster with access to GPUs, agents, plugins, etc.

Goal: What should the final outcome look like, ideally?

This feature should enable something like the following:

from flytekit.experimental import eager
from flytekit.types.directory import FlyteDirectory
from flytekit.remote.remote import Config, FlyteRemote

remote = FlyteRemote(
    config=Config.auto(config_file="config.yaml"),
    default_project="project_name",
    default_domain="development",
)

# Fetch the registered launch plan by reference; its source code is not needed locally.
remote_workflow = remote.fetch_launch_plan(
    project="project_name", name="workflow_name", version="version_number"
)

@eager(remote=remote)
async def toy_example(directories: FlyteDirectory) -> bool:
    # Desired behavior: awaiting the fetched entity runs it on the cluster.
    result = await remote_workflow(directories=[directories])
    return True

More concisely, a new class (e.g. import_launchplan) could be used inside the eager workflow and inherit its remote definition to fetch entities: result = await import_launchplan('lp_name')(directories=[directories])

And be executable with:

pyflyte --config config.yaml run --project project_name toy_example.py toy_example --directories ./

which currently returns the error: Remotely fetched entities cannot be run locally. Please mock the workflows.remote_workflow.execute.
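To make the import_launchplan idea above more concrete, here is a minimal sketch of what such a helper might look like. It is not an existing flytekit API: the name comes from this proposal, and the signature is an assumption. Unlike the proposal, the remote is passed in explicitly rather than inherited from the eager workflow, and the version is required. The sketch relies only on FlyteRemote.fetch_launch_plan and FlyteRemote.execute, and offloads the blocking execute call to a worker thread with asyncio.to_thread (Python 3.9+).

```python
import asyncio
from typing import Any, Awaitable, Callable


def import_launchplan(
    remote: Any, name: str, version: str
) -> Callable[..., Awaitable[Any]]:
    """Hypothetical helper (not part of flytekit): bind a registered
    launch plan to an awaitable callable.

    `remote` is expected to behave like flytekit's FlyteRemote, i.e. to
    provide fetch_launch_plan(...) and execute(...).
    """
    # Fetch by reference; project and domain fall back to the remote's defaults.
    launch_plan = remote.fetch_launch_plan(name=name, version=version)

    async def run(**inputs: Any) -> Any:
        # execute(..., wait=True) blocks until the execution finishes, so it
        # runs in a worker thread to keep the event loop responsive.
        execution = await asyncio.to_thread(
            remote.execute, launch_plan, inputs=inputs, wait=True
        )
        return execution.outputs

    return run
```

With a helper like this, the body of the coroutine could read result = await import_launchplan(remote, "lp_name", "version_number")(directories=[directories]). How it would interact with the @eager decorator itself is left open here.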

Describe alternatives you've considered

It's possible to use pyflyte run --remote followed by pyflyte fetch to get the assets back locally, but that's a more manual process.

It would also be possible to define everything in a remote context within a Python script, including the fetch, but that moves development further away from how the workflow runs in production.
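For comparison, the script-based alternative might look roughly like the sketch below. The function names build_remote and run_launch_plan_remotely are made up for illustration. The FlyteRemote constructor arguments mirror the snippet in this issue, and the flytekit imports are deferred into build_remote so the rest of the module has no flytekit dependency.

```python
from typing import Any, Dict


def build_remote(config_file: str, project: str, domain: str) -> Any:
    """Create a FlyteRemote from a config file (requires flytekit)."""
    # Imported lazily so this module can be loaded without flytekit installed.
    from flytekit.configuration import Config
    from flytekit.remote import FlyteRemote

    return FlyteRemote(
        config=Config.auto(config_file=config_file),
        default_project=project,
        default_domain=domain,
    )


def run_launch_plan_remotely(
    remote: Any, name: str, version: str, inputs: Dict[str, Any]
) -> Any:
    """Fetch a registered launch plan, run it on the cluster, and return its outputs."""
    launch_plan = remote.fetch_launch_plan(name=name, version=version)
    # wait=True blocks until the remote execution reaches a terminal state.
    execution = remote.execute(launch_plan, inputs=inputs, wait=True)
    return execution.outputs


# Example call (needs a reachable Flyte cluster, so it is left commented out):
# remote = build_remote("config.yaml", "project_name", "development")
# outputs = run_launch_plan_remotely(
#     remote, "workflow_name", "version_number", {"directories": ["./"]}
# )
```

This works without any change to flytekit, but the orchestration logic then lives in an ad hoc script rather than in an @eager workflow that can itself be registered, which is the gap described above.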

Propose: Link/Inline OR Additional context

Could be related to: https://github.com/flyteorg/flyte/pull/4570

dosubot[bot] commented 11 months ago

🤖

This sounds like a great addition to Flyte! Your proposed feature would definitely streamline the workflow development process and make it more efficient. It's clear that you've put a lot of thought into the potential implementation and the benefits it would bring.

I would encourage you to go ahead and create a pull request for this feature. The community and maintainers will appreciate your contribution and can provide feedback on the proposed changes. Keep up the great work!

