Describe the bug

As a user, I would expect the caching behaviour to be the same when executing a workflow in a cluster and when executing it locally as a Python script. In practice, there are situations where the behaviour differs; the two cases below illustrate this.

Expected behavior

[ ] Caching of tasks without return values:

Locally, such a task can be cached, while in a cluster execution it cannot be; Flyteconsole says "Caching was disabled for this execution".

As a user, I have a strong preference for being able to cache tasks without a return value, because tasks can have side effects (e.g. storing a resulting metric in a metadata store) that do not need a return value but are still supposed to be cached. We have multiple tasks in our code base that carry a dummy return value only to allow the task to be cached.
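For illustration, a minimal sketch of such a task, together with the dummy-return workaround we currently use, might look like this (hypothetical, simplified example; the function and parameter names are made up):

from flytekit import task


# A cached task whose only purpose is a side effect (e.g. writing a metric
# to a metadata store), so there is nothing meaningful to return.
@task(cache=True, cache_version="1.0")
def record_metric(value: float) -> None:
    print(f"recording metric {value}")  # stand-in for the metadata-store call


# Workaround: a dummy return value added solely so that remote executions
# of the task can be cached.
@task(cache=True, cache_version="1.0")
def record_metric_with_dummy(value: float) -> int:
    print(f"recording metric {value}")
    return 0  # never used downstream


The first variant is what we would like to be able to cache on the cluster as well; the second variant is the workaround we currently rely on.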
[ ] Cache misses upon schema changes:
from dataclasses import dataclass
from dataclasses_json import dataclass_json
from flytekit import task, workflow


@dataclass_json
@dataclass
class Foo:
    a: int
    # b: int  # uncommenting this field is the schema change described below


@task(cache=True, cache_version="1.0")
def t1() -> Foo:
    print("Foo")
    return Foo(a=42)  # becomes Foo(a=42, b=42) after the schema change


@workflow
def wf():
    t1()


if __name__ == "__main__":
    wf()
After executing this workflow once, adding b: int to Foo as an example of a schema change, and executing again, the remote execution shows the expected cache miss, but the local execution produces an unexpected cache hit. The local behaviour needs to be adapted.
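For completeness, the second run of the example uses the dataclass with the commented-out field enabled, i.e. roughly:

from dataclasses import dataclass
from dataclasses_json import dataclass_json
from flytekit import task


@dataclass_json
@dataclass
class Foo:
    a: int
    b: int  # newly added field, i.e. the schema change


@task(cache=True, cache_version="1.0")
def t1() -> Foo:
    print("Foo")
    return Foo(a=42, b=42)
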
Additional context to reproduce
No response
Screenshots
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?