PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
16.44k stars 1.59k forks source link

Log a warning/raise error if user tries to cache something undeterministic (uncacheable) #12229

Open N-Demir opened 8 months ago

N-Demir commented 8 months ago

First check

Prefect Version

2.x

Describe the current behavior

I initially thought this was a prefect bug, but after a lot of investigation I was able to determine that cloudpickle used by task_input_hash is non deterministic from run to run in lots of different scenarios. See the following issues opened on their end:

For me, the issue came from trying to pass in a class to a prefect task with caching enabled using task_input_hash. Strangely this is only deterministic if the class is first imported from another module and not defined in the same file as the script that kicks off the flow (not defined in __main__).

This single hard to debug issue has led to an incredible amount of frustration at prefect and confusion among myself and team members, and from surveying the landscape of caching in python there doesn't seem to be an incredible out of the box easier solution

Describe the proposed behavior

I'm not expert enough to know of what the best solution is or if any of the other libraries I mentioned are definitively better, but from a user's perspective I want to emphasize one simpler thing that streamlit does well that has saved me a ton of pain in the past: the UnhashableParamError

In streamlit's caching if you try to cache a custom object it will just error and you will receive a UnhashableParamError. Had prefect done this for me it would've alleviated so so much confusion about why caching wasn't working (leading me down rabbit holes of getting more confused with the distinction between prefect caching and results).

Perhaps an error is aggressive, but even a warning log would make a huge difference to debugging why from run to run prefect's caching isn't working. And maybe task_input_hash should be more restrictive in forcing user's to have hashable args instead of blindly using a nondeterministic cloudpickle as the failsafe to json

Example Use

No response

Additional context

No response

jakekaplan commented 8 months ago

Hi @N-Demir, thanks for submitting this. I'm sorry about the friction here! I also appreciate the effort you put into writing this issue.

Do you have an MRE you can share with what you were providing to task_input_hash that can show/reproduce the behavior you were seeing? Thanks!