pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Enable PipedagConfig construction from dictionary #142

Closed by windiana42 5 months ago

windiana42 commented 6 months ago

If someone wants to produce pipedag configuration programmatically, it should be possible to construct a PipedagConfig object from a dictionary. Ideally, passing PipedagConfig.raw_config into that constructor would yield exactly the same PipedagConfig.config_dict.
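
To make the round-trip requirement concrete, here is a minimal sketch. `DictBackedConfig` is a hypothetical stand-in written for this illustration, not the library's `PipedagConfig`, and the `instances` default in `_parse` is only a placeholder for whatever the real parser does.

```python
import copy


class DictBackedConfig:
    """Hypothetical stand-in for PipedagConfig; not the library's class."""

    def __init__(self, raw_config: dict):
        # Keep a private copy of the input so it can be fed back in later.
        self.raw_config = copy.deepcopy(raw_config)
        self.config_dict = self._parse(self.raw_config)

    @staticmethod
    def _parse(raw: dict) -> dict:
        # Placeholder for the real parsing step: it only fills in one default
        # and never modifies the raw dictionary it was given.
        parsed = copy.deepcopy(raw)
        parsed.setdefault("instances", {})  # assumed key, for illustration only
        return parsed


raw = {
    "table_store_connections": {
        "duckdb": {"args": {"url": "duckdb:///example.duckdb"}},
    },
}

first = DictBackedConfig(raw)
# Round trip: constructing from raw_config must reproduce the same config_dict.
second = DictBackedConfig(first.raw_config)
assert second.config_dict == first.config_dict
```

The property only holds if parsing never writes back into `raw_config`, which is why the sketch copies the dictionary before applying defaults.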

An extended version of this feature would offer helper functions that produce config dictionaries with good defaults from very little code.
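
As a rough sketch of what such a helper could look like, using DuckDB as the example backend: `basic_duckdb_config` is a hypothetical name, not an existing pipedag function. Only the `table_store_connections` / connection name / `args` / `url` nesting is taken from this issue; the top-level `name` key and the connection name `duckdb` are assumptions.

```python
def basic_duckdb_config(name: str, db_path: str) -> dict:
    """Hypothetical helper (not part of pipedag): build a small config dict."""
    return {
        "name": name,  # assumed top-level key, for illustration only
        "table_store_connections": {
            # Nesting mirrors raw_config[...][connection_name]["args"]["url"].
            "duckdb": {"args": {"url": f"duckdb:///{db_path}"}},
        },
    }


config = basic_duckdb_config("getting_started", "pipedag.duckdb")
```

A real default configuration would likely need more sections than this; the point is only that a few arguments could be enough to produce the dictionary.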

Hacky workaround until this feature is available:

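    # Import path is an assumption; adjust it to your pipedag version.
    from pydiverse.pipedag.core.config import PipedagConfig
    # connection_name, override_engine_url, and instance are defined by the caller.
    pipedag_config = PipedagConfig()
    # Override the engine URL in the raw (unparsed) configuration.
    pipedag_config.raw_config["table_store_connections"][connection_name]["args"]["url"] = override_engine_url
    # Re-run the private, name-mangled parser so config_dict picks up the override.
    pipedag_config.config_dict = pipedag_config._PipedagConfig__parse_config(
        pipedag_config.raw_config
    )
    cfg = pipedag_config.get(instance)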
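
The dictionary edit in the workaround above can also be done on a copy, so the original raw configuration stays untouched. This is a sketch in plain Python that does not call pipedag at all; `with_engine_url` is a hypothetical helper name, and the example connection name and URLs are made up.

```python
import copy


def with_engine_url(raw_config: dict, connection_name: str, url: str) -> dict:
    """Return a copy of raw_config with one connection's URL replaced."""
    updated = copy.deepcopy(raw_config)
    updated["table_store_connections"][connection_name]["args"]["url"] = url
    return updated


raw = {
    "table_store_connections": {
        "duckdb": {"args": {"url": "duckdb:///a.duckdb"}},
    },
}
patched = with_engine_url(raw, "duckdb", "duckdb:///b.duckdb")
```

With a dictionary-based constructor, `patched` could be passed in directly instead of calling the private parser.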

windiana42 commented 6 months ago

Ideally, this would allow for much simpler getting-started documentation with a programmatically created duckdb configuration:

windiana42 commented 5 months ago

Closed with #142