Open jverhoeks opened 6 months ago
Awesome! Would greatly fit into https://github.com/datacontract/datacontract-cli/blob/main/datacontract/export/sql_converter.py for different (all supported server types)
Do you have some time to prepare a PR?
What kind of SQL data types you use in the datacontract format? Is that duckdb based?
The Data Contract Specification defines these data types: https://datacontract.com/#data-types
I have finished my other PR https://github.com/datacontract/datacontract-cli/pull/166 I'll work on this now for both import and export
Can you elaborate how you would implement this here? We do have a lot of needs for the type mappings for various exports, and sometimes we even export sql. I am not totally convinced how this helps here to be honest.
As most tapes are very similar to snowflake types, one could use those and convert to others. But there are exceptions where we have special type mappings for other technologies. If we map to one sql dialect and then from there, we no longer have access to the original information.
I'm working on a small demo where i use the AST from sqlglot to build a generic import & export (https://github.com/tobymao/sqlglot?tab=readme-ov-file#ast-introspection)
sqlfile + dialect -> AST -> datacontract datacontract -> AST -> sqlfile+dialect
Starting to work on some test code: https://github.com/jverhoeks/test_sqlparse
Datatype mapping is in progress
The challenge with sqlglot is to overwrite the type (or other properties) from the config object https://datacontract.com/#config-object for the target sql dialect.
@jverhoeks's approach sounds interesting, if we find a way to specify the type for the generation process.
SQLglot is a great python library to read, parse , translate and write 21 different sql dialects.
https://github.com/tobymao/sqlglot
It supports 21 dialects. This make it able to create the schema in 1 sql language and then translate to the others. Where sqlglot takes care of the syntax. It can also read any create table sql statement and convert into a standard AST for import parsing.
Output