pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

#142: Easier configuration setup #148

Closed: windiana42 closed this 5 months ago

windiana42 commented 5 months ago

Currently, the only way to configure pipedag is to provide a YAML configuration file. More alternatives, including configuration in code, should be provided.

Checklist

windiana42 commented 5 months ago

@NMAC427 FYI: we are making the examples that use code-based configuration, with duckdb as a containerless database, more prominent in the documentation.
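
As a rough illustration of what code-based configuration with duckdb could look like, the sketch below builds a configuration as a plain Python dictionary instead of loading it from a YAML file. The helper name build_basic_config, the dictionary keys, and the default instance name are assumptions made for illustration only; they are not taken from this PR, and the real pipedag helper discussed in this thread may have a different signature and structure. The only external convention relied on is the SQLAlchemy-style duckdb:/// URL for a file-backed database.

```python
from pathlib import Path


def build_basic_config(db_file: Path, instance_id: str = "example_instance") -> dict:
    """Hypothetical helper: return a configuration dictionary built in code.

    The key names below are illustrative and do not reflect pipedag's
    actual configuration schema.
    """
    return {
        "instance_id": instance_id,
        "table_store": {
            # duckdb runs in-process, so no database container is needed.
            "url": f"duckdb:///{db_file.as_posix()}",
        },
    }


# Build the configuration without touching the file system or any YAML file.
config = build_basic_config(Path("pipedag_example") / "db.duckdb")
print(config["table_store"]["url"])
```

Because the configuration is an ordinary Python object, an example can derive values such as the database path at runtime, which is harder to do with a static file.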

NMAC427 commented 5 months ago

Seems reasonable. Though I would probably rename get_basic_pipedag_config to create_basic_pipedag_config.

windiana42 commented 5 months ago

Since this PR already touches example code and configuration widely, I also included a switch from ZookeeperLockManager to DatabaseLockManager for the default tests. DatabaseLockManager is the one I would recommend to everyone working with a database target that supports it (mssql, postgres, db2).
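
Conceptually, the lock manager switch is a change to a single configuration entry. The sketch below shows that idea on a plain dictionary. The function with_database_lock_manager is hypothetical, the key names (lock_manager, class, args) are assumptions for illustration, and the class names are used only as strings in the spelling that appears in this thread; real import paths and constructor arguments are not shown here.

```python
import copy

# Database targets named in this thread as supporting DatabaseLockManager.
SUPPORTED_DIALECTS = {"mssql", "postgres", "db2"}


def with_database_lock_manager(config: dict, dialect: str) -> dict:
    """Hypothetical helper: return a copy of config that uses DatabaseLockManager.

    If the dialect is not in the supported set, the copy is returned unchanged.
    The input dictionary is never modified.
    """
    new_config = copy.deepcopy(config)
    if dialect in SUPPORTED_DIALECTS:
        # Replace the whole entry so stale arguments of the old manager are dropped.
        new_config["lock_manager"] = {"class": "DatabaseLockManager"}
    return new_config


# Illustrative starting point with assumed keys and a placeholder host.
old_config = {
    "lock_manager": {
        "class": "ZookeeperLockManager",
        "args": {"hosts": "localhost:2181"},
    }
}
new_config = with_database_lock_manager(old_config, "postgres")
print(new_config["lock_manager"]["class"])
```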