flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org

[Core feature] Add cache support for the dynamic spec #5543

Open pingsutw opened 2 months ago

pingsutw commented 2 months ago

Motivation: Why do you think this is important?

Flyte launches a new pod to recompile a dynamic task even when its inputs are unchanged.

from typing import Annotated, List
import numpy as np
from flytekit import dynamic, kwtypes, map_task
from flytekit.types.directory import FlyteDirectory

@dynamic(container_image=image_spec, cache_version="1", cache=True)
def embedding_generation(shards: List[List[Annotated[np.ndarray, kwtypes(allow_pickle=True)]]]) -> List[List[FlyteDirectory]]:
    # Fan out one map task per shard; today the compiled dynamic spec (future.pb) is not cached.
    vectorstores = []
    for shard in shards:
        vectorstores.append(map_task(create_embeddings)(chunks=shard))
    return vectorstores

Use cases:

  1. Run the dynamic workflow.
  2. create_embeddings OOMs.
  3. Increase the resources for create_embeddings (see the sketch after this list).
  4. Rerun the dynamic task: I expected Propeller to read the dynamic spec (future.pb) from the cache instead of launching a pod and recompiling the workflow.
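For step 3, note that output caching on rerun already works for regular tasks. The issue does not show the create_embeddings task, so the following is a minimal hypothetical sketch of what it might look like; its signature and resource values are assumptions:

from typing import Annotated
import numpy as np
from flytekit import Resources, kwtypes, task
from flytekit.types.directory import FlyteDirectory

# Hypothetical definition of create_embeddings; the real one is not shown in the issue.
# Raising the memory request (step 3) changes the pod spec but not the task interface
# or cache_version, so outputs cached for unchanged inputs still hit on the rerun;
# the dynamic task's spec compilation is what misses today.
@task(cache=True, cache_version="1", requests=Resources(mem="8Gi"))
def create_embeddings(chunks: Annotated[np.ndarray, kwtypes(allow_pickle=True)]) -> FlyteDirectory:
    ...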

Goal: What should the final outcome look like, ideally?

Propeller should return the dynamic spec (future.pb) from the cache when the inputs haven't changed, instead of launching a pod to recompile it.
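Conceptually, this would extend the same catalog lookup Flyte already uses for task outputs to the compiled spec. Propeller itself is written in Go, so the following Python pseudocode is only an illustration; hash_inputs, catalog_get, catalog_put, and compile_dynamic_workflow are hypothetical names, not real APIs:

# Illustrative pseudocode only; every function named here is hypothetical.
def resolve_dynamic_spec(task_id, cache_version, inputs):
    # Same key shape used for cached task outputs: task + cache_version + input hash.
    key = (task_id, cache_version, hash_inputs(inputs))
    spec = catalog_get(key)
    if spec is None:
        # Cache miss: launch the pod and compile the subworkflow (future.pb).
        spec = compile_dynamic_workflow(task_id, inputs)
        catalog_put(key, spec)
    # Cache hit: reuse the stored future.pb without launching a pod.
    return spec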

Describe alternatives you've considered

NA

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

runllm[bot] commented 2 months ago

According to the Flyte documentation, caching can be enabled for tasks using the cache and cache_version parameters. However, there are known issues with caching in dynamic workflows, as discussed in this GitHub issue. The issue states that Flyte currently does not cache intermediate results in dynamic workflows correctly, which aligns with the problem you're describing.

Would you like more details on this issue or potential workarounds?

This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!