pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Would it be feasible to auto-detect the input_type parameter of @materialize in simple cases? #35

Open windiana42 opened 1 year ago

windiana42 commented 1 year ago

Would it be feasible to auto-detect the input_type parameter of @materialize in simple cases such as the following:

import sqlalchemy as sa
from pydiverse.pipedag import Table, materialize

@materialize(lazy=True)
def lazy_task_2(input: sa.Table):
    return Table(sa.select([(input.c.x * 5).label("x5")]), name="task_2_out", primary_key=["x5"])

windiana42 commented 1 year ago

Maybe even the lazy parameter could be auto-detected from the function signature.
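
To make the idea concrete, here is a minimal sketch of signature-based detection using only the standard library. It is not pipedag's implementation: `infer_input_type`, `SqlTable`, and `DataFrame` are hypothetical names, with the two classes standing in for `sa.Table` and `pd.DataFrame` so the snippet runs without extra dependencies. It only handles the simple case where all annotated parameters share one type.

```python
import inspect
import typing


class SqlTable:
    """Hypothetical stand-in for sa.Table."""


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame."""


def infer_input_type(fn):
    """Return the one annotation shared by all annotated parameters, else None."""
    hints = typing.get_type_hints(fn)
    hints.pop("return", None)  # the return annotation says nothing about inputs
    param_names = inspect.signature(fn).parameters
    annotations = {hints[name] for name in param_names if name in hints}
    # Only the simple case is handled: exactly one distinct annotation.
    return annotations.pop() if len(annotations) == 1 else None


def lazy_task_2(input: SqlTable):
    ...


def mixed_task(a: SqlTable, b: DataFrame):
    ...


def untyped_task(a, b):
    ...
```

With this sketch, `infer_input_type(lazy_task_2)` yields `SqlTable`, while the mixed and untyped tasks yield `None`, which a decorator could treat as "fall back to an explicit input_type".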

windiana42 commented 1 year ago

Improving the error messages for failures to comply with the underlying assumptions might also help. This is a very common error case: from the programmer's point of view, the information is redundant and easily becomes inconsistent.
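
As a rough illustration of what a clearer error could look like, the sketch below compares an explicitly passed input_type with the parameter annotations and names the offending parameter. Everything here is an assumption for illustration: `check_input_type`, `InputTypeMismatch`, and the stand-in classes are hypothetical and not part of pipedag.

```python
import inspect
import typing


class SqlTable:
    """Hypothetical stand-in for sa.Table."""


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame."""


class InputTypeMismatch(TypeError):
    """Hypothetical error: declared input_type disagrees with an annotation."""


def _type_name(tp):
    # Fall back to repr() for annotations that have no __name__.
    return getattr(tp, "__name__", repr(tp))


def check_input_type(fn, input_type):
    """Raise a descriptive error if any annotation contradicts input_type."""
    hints = typing.get_type_hints(fn)
    for name in inspect.signature(fn).parameters:
        annotation = hints.get(name)
        if annotation is not None and annotation is not input_type:
            raise InputTypeMismatch(
                f"Task {fn.__name__!r}: parameter {name!r} is annotated as "
                f"{_type_name(annotation)}, but input_type="
                f"{_type_name(input_type)} was passed to the decorator."
            )


def task(tbl: SqlTable):
    ...
```

Calling `check_input_type(task, SqlTable)` passes silently, while `check_input_type(task, DataFrame)` raises an error that points at `tbl` instead of failing later inside the task body.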