dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0

allow for minimal table reflection level and allow for custom database types to be recognized in `sql_database` #430

Open rudolfix opened 2 months ago

rudolfix commented 2 months ago

Source name

sql_database

Describe the data you'd like to see

sql_database has two table reflection levels: (1) primary keys + data types, and (2) level 1 plus precision information for int/varchar/binary columns. This is incompatible with the old behavior and also creates problems when arrow/pandas data types differ from the reflected data types. In the latter case we should allow dlt to infer data types from the data (we are not coercing the content of the data frames).
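
The reflection levels described above could be modeled roughly like this. This is a minimal sketch: the `ReflectionLevel` literal values come from the task list below, but the `reflect_columns` helper and the column dictionary shape are illustrative assumptions, not the actual dlt API.

```python
from typing import Dict, Literal

# hypothetical values for the proposed `reflect_table_schema` option
ReflectionLevel = Literal["minimal", "full", "full_with_precision"]

def reflect_columns(
    raw_columns: Dict[str, Dict[str, object]],
    level: ReflectionLevel,
) -> Dict[str, Dict[str, object]]:
    """Strip reflected column metadata down to the requested level (sketch)."""
    result: Dict[str, Dict[str, object]] = {}
    for name, col in raw_columns.items():
        if level == "minimal":
            # keep only primary key flags; let dlt infer data types from the data
            result[name] = {"primary_key": col.get("primary_key", False)}
        elif level == "full":
            # keep reflected data types but drop precision/scale hints
            result[name] = {
                k: v for k, v in col.items() if k not in ("precision", "scale")
            }
        else:  # "full_with_precision"
            result[name] = dict(col)
    return result
```

Under this sketch, `minimal` would restore the old behavior of inferring types from arrow/pandas frames, while `full_with_precision` matches today's `with_precision_hints=True`.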

    • [ ] drop `with_precision_hints` and introduce `reflect_table_schema` with the values `minimal`, `full`, and `full_with_precision`
    • [ ] add a new callback `type_conversion_fallback` that will be called BEFORE any sqlalchemy type is converted to a dlt type; the callback should let the user cast or ignore custom sqlalchemy types
    • [ ] make the bigquery tests work (right now they fail on JSON fields, which should be converted to string, not complex; use (2))
    • [ ] pass `backend_kwargs` to `create_engine` for special dialect options. One of the use cases is `arraysize` for cx_Oracle; currently we hack it like this:
      ```python
      def engine_from_credentials(
          credentials: Union[ConnectionStringCredentials, Engine, str]
      ) -> Engine:
          if isinstance(credentials, Engine):
              return credentials
          if isinstance(credentials, ConnectionStringCredentials):
              credentials = credentials.to_native_representation()
          return create_engine(credentials, arraysize=1)  # HERE!
      ```
    • [ ] make sure CI works
    • [ ] update README
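
The `type_conversion_fallback` callback from the task list might look roughly like this. The callback name comes from the list above, but everything else here is an illustrative assumption, not the actual dlt API: the signature, the `IGNORE` sentinel, the stand-in `BigQueryJSON` type, and the simplified built-in mapping.

```python
from typing import Callable, Optional

# stand-in for a custom sqlalchemy dialect type that dlt cannot map (illustrative)
class BigQueryJSON:
    pass

# sentinel a callback may return to drop the column entirely (assumption)
IGNORE = object()

# callback: receives the sqlalchemy type, returns a dlt type name, IGNORE,
# or None to fall through to the default conversion
TypeConversionFallback = Callable[[object], Optional[object]]

def to_dlt_type(
    sa_type: object, fallback: Optional[TypeConversionFallback] = None
) -> Optional[str]:
    """Convert a (stand-in) sqlalchemy type to a dlt type name, consulting the
    user callback BEFORE any built-in conversion, as the task list requires."""
    if fallback is not None:
        result = fallback(sa_type)
        if result is IGNORE:
            return None  # user asked to skip this column
        if result is not None:
            return str(result)  # user cast the custom type themselves
    # built-in conversion (greatly simplified for the sketch)
    if isinstance(sa_type, BigQueryJSON):
        return "complex"
    return "text"

def json_as_string(sa_type: object) -> Optional[str]:
    # example callback: map the dialect's JSON type to string, not complex,
    # which is what the bigquery test item above needs
    return "text" if isinstance(sa_type, BigQueryJSON) else None
```

Because the fallback runs first, users can intercept dialect-specific types (such as the JSON fields that currently break the bigquery tests) without dlt having to know about every dialect.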

Are you a dlt user?

Yes, I'm already a dlt user.

Are you ready to contribute this extension?

Yes, I'm ready.

dlt destination

No response

Additional information

No response