dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0
48 stars 38 forks source link

Sql database reflection level #487

Open steinitzu opened 3 weeks ago

steinitzu commented 3 weeks ago

Tell us what you do here

Short description

Test pass on dlt devel locally. May not on release version currently.

Related Issues

Additional Context

steinitzu commented 2 weeks ago

this 90% good.

1. we have a related bug [sql_database source | error with pyarrow BE, if some of the types were not identified correctly #488](https://github.com/dlt-hub/verified-sources/issues/488) see in the review

I added some logic in row_tuples_to_arrow to skip unknown type columns. They are returned without data type in dlt schema to preserve the index. The other backends type is still inferred where possible.

2. please add a test where we have an unsupported type like GEOMETRY in the table. let's try to load it with and without type adapter

Added a kinda massive parametrized test for all backends/type adapter/reflection levels. Using DATERANGE and ARRAY columns. Didn't want to require postgis or anything and make the test setup more complicated. But could be nice to test with snowflake source since they use all custom types that don't inherit regular sqla types.

Type adapter can work in some cases, but for instance trying to load the DATERANGE object to arrow as complex/text type does not work. We would need to run through the data and coerce values for that which would be a big performance hit.

3. we need docs update with examples of type adapter, same thing in sql database pipeline demo

https://github.com/dlt-hub/dlt/pull/1467