pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Automatically add ExternalTableReference to auto_table list #150

Open windiana42 opened 5 months ago

windiana42 commented 5 months ago

It would make sense if a task could simply return an ExternalTableReference without wrapping it in a pipedag.Table() object. This can be achieved by adding the class to the auto_table configuration. However, it would be odd to require users to configure this themselves, so it could instead be added automatically in the ConfigContext property:

    @cached_property
    def auto_table(self) -> tuple[type, ...]:
        return tuple(map(import_object, self._config_dict.get("auto_table", ())))

It might be nice to rename the property slightly (e.g. enriched_auto_table) so that nobody assumes its content is identical to the auto_table entry in the config dict.
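
To illustrate the intent, here is a minimal, self-contained sketch of how a tuple of auto_table types might be used to wrap task return values. `Table`, `ExternalTableReference`, and `wrap_auto_tables` below are simplified stand-ins written for this example, not pipedag's actual classes or implementation, and the sketch assumes the tuple is used as an `isinstance` filter:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Table:
    """Hypothetical stand-in for pipedag.Table: wraps an arbitrary object."""
    obj: Any


@dataclass
class ExternalTableReference:
    """Hypothetical stand-in: points at a table that already exists."""
    name: str
    schema: str


def wrap_auto_tables(value: Any, auto_table: tuple[type, ...]) -> Any:
    """Wrap `value` in Table if its type is listed in `auto_table`.

    Assumption: the auto_table tuple is used as an isinstance filter.
    """
    if isinstance(value, auto_table):
        return Table(value)
    return value


ref = ExternalTableReference(name="customers", schema="raw")

# Without the class in auto_table, the task result is passed through as-is.
unwrapped = wrap_auto_tables(ref, ())

# With the class included, the reference is wrapped automatically.
wrapped = wrap_auto_tables(ref, (ExternalTableReference,))
```

Under these assumptions, a task could return the bare reference and still end up with a `Table`, which is the behaviour the issue asks for.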

windiana42 commented 5 months ago

@NMAC427 @nicolasmueller any objections to going with:

    @cached_property
    def effective_auto_table(self) -> tuple[type, ...]:
        # Build the tuple by unpacking; tuple() accepts only a single iterable argument.
        return (*map(import_object, self._config_dict.get("auto_table", ())), ExternalTableReference)
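
A runnable sketch of the proposed property, for reference. `ConfigContext`, `import_object`, and `ExternalTableReference` here are simplified stand-ins (assumptions for this example, not the real pipedag code). The combined tuple is built with unpacking, because `tuple()` accepts at most one iterable argument:

```python
import importlib
from functools import cached_property


def import_object(path: str):
    """Stand-in helper: resolve a dotted path like 'pkg.module.Name'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


class ExternalTableReference:
    """Hypothetical stand-in for the real class."""


class ConfigContext:
    """Reduced stand-in that only holds the raw config dict."""

    def __init__(self, config_dict: dict):
        self._config_dict = config_dict

    @cached_property
    def auto_table(self) -> tuple[type, ...]:
        # Exactly what the user configured.
        return tuple(map(import_object, self._config_dict.get("auto_table", ())))

    @cached_property
    def effective_auto_table(self) -> tuple[type, ...]:
        # User-configured types plus the implicitly added reference class.
        return (*self.auto_table, ExternalTableReference)


ctx = ConfigContext({"auto_table": ["collections.OrderedDict"]})
empty_ctx = ConfigContext({})
```

Keeping `auto_table` as the unmodified config value and exposing the extended tuple under a separate name would make it clear which of the two includes the implicit entry.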