pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Allow custom identifier names #200

Open DominikZuercherQC opened 1 month ago

DominikZuercherQC commented 1 month ago

Currently pipedag assembles the identifier for the primary key as pk etc. For tables with long column names and primary keys that are made up of many columns, the identifier becomes too long.

On MSSQL Server this triggers the following error: sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]The identifier that starts with 'pk_xxxxx' is too long. Maximum length is 128. (103) (SQLExecDirectW)")

I think this issue could be solved by allowing the user to pass a custom name for the primary key identifier.
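
To illustrate the idea, here is a minimal sketch in plain Python. It is not pipedag's actual code: the function `primary_key_name`, its `custom_name` parameter, and the default naming scheme (`pk_` followed by the column names and the table name) are assumptions made for illustration only. The 128-character limit is the one quoted in the error message above.

```python
from typing import Optional, Sequence

# Limit quoted in the MSSQL error message above.
MSSQL_MAX_IDENTIFIER_LENGTH = 128


def primary_key_name(
    table: str,
    columns: Sequence[str],
    custom_name: Optional[str] = None,
) -> str:
    """Return a primary key identifier, preferring a user-supplied name.

    Hypothetical helper: the default scheme below is an assumption,
    not pipedag's real naming logic.
    """
    if custom_name is not None:
        name = custom_name
    else:
        # Assumed default: grows with the number and length of columns.
        name = "pk_" + "_".join(columns) + "_" + table

    # Fail early instead of letting the database reject the statement.
    if len(name) > MSSQL_MAX_IDENTIFIER_LENGTH:
        raise ValueError(
            f"identifier starting with {name[:20]!r} is {len(name)} characters "
            f"long; maximum is {MSSQL_MAX_IDENTIFIER_LENGTH}"
        )
    return name
```

With ten long column names the assumed default scheme exceeds the limit and raises, whereas passing something like `custom_name="pk_orders"` yields a short identifier that the database would accept.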