pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Add option to `Flow.run` to force execution of all tasks #166

Closed nicolasmueller closed 3 months ago

nicolasmueller commented 3 months ago

Fixes #165

nicolasmueller commented 3 months ago

This PR also fixes an issue where a lazy task was not executed even when run standalone (without a stage) if it was cache valid.

NMAC427 commented 3 months ago

If I have time, I’d like to review this a bit more closely. (Everything cache-related is prime bug territory.)

NicolasMuellerQC commented 3 months ago

I could not find any code that copies `force_task_execution` to `_force_task_execution`.

I believe that happens in `ConfigContext.evolve`, which calls `attrs.evolve`: https://www.attrs.org/en/stable/api.html#attrs.evolve. The attrs documentation says, for example:

"private attributes should be specified without the leading underscore, just like in `__init__`."

windiana42 commented 3 months ago

To make it even more complicated, we should not forget this feature when thinking about refactoring config options:

cfg = ConfigContext.get().evolve(ignore_task_version=True)

windiana42 commented 3 months ago

Closing in favor of #170.