flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org

[Core feature] Add cache support for the dynamic spec #5543

Open pingsutw opened 2 months ago

pingsutw commented 2 months ago

Motivation: Why do you think this is important?

Flyte launches a new pod to recompile a dynamic task even when its inputs are unchanged.

from typing import Annotated, List
import numpy as np
from flytekit import dynamic, kwtypes, map_task
from flytekit.types.directory import FlyteDirectory

@dynamic(container_image=image_spec, cache_version="1", cache=True)
def embedding_generation(shards: List[List[Annotated[np.ndarray, kwtypes(allow_pickle=True)]]]) -> List[List[FlyteDirectory]]:
    # Fan out one map task per shard; today the compiled dynamic spec (future.pb) is not cached.
    vectorstores = []
    for shard in shards:
        vectorstores.append(map_task(create_embeddings)(chunks=shard))
    return vectorstores

Use cases:

  1. Run the dynamic workflow.
  2. create_embeddings OOMs.
  3. Increase the resources for create_embeddings (see the sketch after this list).
  4. Rerun the dynamic task: I expected Propeller to read the dynamic spec (future.pb) from the cache instead of launching a pod and recompiling the workflow.
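For step 3, note that output caching on rerun already works for regular tasks. The issue does not show the create_embeddings task, so the following is a minimal hypothetical sketch of what it might look like; its signature and resource values are assumptions:

from typing import Annotated
import numpy as np
from flytekit import Resources, kwtypes, task
from flytekit.types.directory import FlyteDirectory

# Hypothetical definition of create_embeddings; the real one is not shown in the issue.
# Raising the memory request (step 3) changes the pod spec but not the task interface
# or cache_version, so outputs cached for unchanged inputs still hit on the rerun;
# the dynamic task's spec compilation is what misses today.
@task(cache=True, cache_version="1", requests=Resources(mem="8Gi"))
def create_embeddings(chunks: Annotated[np.ndarray, kwtypes(allow_pickle=True)]) -> FlyteDirectory:
    ...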

Goal: What should the final outcome look like, ideally?

Propeller should return the dynamic spec (future.pb) from the cache when the inputs haven't changed, instead of launching a pod to recompile it.
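Conceptually, this would extend the same catalog lookup Flyte already uses for task outputs to the compiled spec. Propeller itself is written in Go, so the following Python pseudocode is only an illustration; hash_inputs, catalog_get, catalog_put, and compile_dynamic_workflow are hypothetical names, not real APIs:

# Illustrative pseudocode only; every function named here is hypothetical.
def resolve_dynamic_spec(task_id, cache_version, inputs):
    # Same key shape used for cached task outputs: task + cache_version + input hash.
    key = (task_id, cache_version, hash_inputs(inputs))
    spec = catalog_get(key)
    if spec is None:
        # Cache miss: launch the pod and compile the subworkflow (future.pb).
        spec = compile_dynamic_workflow(task_id, inputs)
        catalog_put(key, spec)
    # Cache hit: reuse the stored future.pb without launching a pod.
    return spec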

Describe alternatives you've considered

NA

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

runllm[bot] commented 2 months ago

According to the Flyte documentation, caching can be enabled for tasks using the cache and cache_version parameters. However, there are known issues with caching in dynamic workflows, as discussed in this GitHub issue. The issue states that Flyte currently does not cache intermediate results in dynamic workflows correctly, which aligns with the problem you're describing.

Would you like more details on this issue or potential workarounds?

This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!