A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
Categorical columns in pandas could be materialized to, and dematerialized from, relational database tables if a second table stores the mapping from categorical IDs to the actual strings.
It would even be possible to provide functions for programmatically generated SQL that automate the resolution with one join per categorical column.
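
The idea can be illustrated with a minimal sketch that uses only Python's built-in `sqlite3` module. Everything named here is an assumption made for illustration and is not an API of this library: the helper `build_resolve_query`, the naming convention `<table>__<column>` for mapping tables, and the mapping-table columns `id` and `label`. The main table stores integer IDs (comparable to the integer codes of a pandas categorical), and the generated query adds one join per categorical column to recover the strings.

```python
import sqlite3


def build_resolve_query(table, categorical_columns, other_columns):
    """Build a SELECT that replaces each categorical ID column with its label.

    Hypothetical convention (an assumption, not a library feature): column
    `c` of `table` stores integer IDs, and a mapping table named
    `<table>__<c>` holds (id, label) pairs.
    """
    select_parts = [f't."{col}"' for col in other_columns]
    join_parts = []
    for i, col in enumerate(categorical_columns):
        alias = f"m{i}"
        select_parts.append(f'{alias}."label" AS "{col}"')
        # LEFT JOIN keeps rows whose ID is NULL (no category assigned).
        join_parts.append(
            f'LEFT JOIN "{table}__{col}" AS {alias} ON t."{col}" = {alias}."id"'
        )
    return (
        f'SELECT {", ".join(select_parts)} FROM "{table}" AS t '
        + " ".join(join_parts)
    )


# Example data: one fact table and one mapping table per categorical column.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, color INTEGER, size INTEGER);
    CREATE TABLE orders__color (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE orders__size (id INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO orders VALUES (1, 0, 1), (2, 1, 0), (3, NULL, 1);
    INSERT INTO orders__color VALUES (0, 'blue'), (1, 'red');
    INSERT INTO orders__size VALUES (0, 'L'), (1, 'S');
    """
)

query = build_resolve_query("orders", ["color", "size"], ["order_id"])
rows = conn.execute(query + ' ORDER BY t."order_id"').fetchall()
for row in rows:
    print(row)
```

The sketch stores a missing category as NULL in the database. pandas itself marks missing values in a categorical with the code -1, so a real materialization step would need to translate between the two representations.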