dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0

sql_database source | error with pyarrow backend if some of the types are not identified correctly #488

Open VioletM opened 3 weeks ago

VioletM commented 3 weeks ago

dlt version

0.4.12

Source name

sql_database

Describe the problem

When using the pyarrow backend, if some of the data types are not identified correctly, dlt raises an error:

ValueError: array split does not result in an equal division
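For context, this exact message is what NumPy's `split` family raises when an array cannot be divided into the requested number of equal sections. The sketch below reproduces the message in isolation. It assumes, without confirmation from this report, that the backend pivots row tuples into columns with NumPy; `split_into_columns` is a hypothetical helper written for illustration and is not dlt's code.

```python
import numpy as np


def split_into_columns(rows, column_names):
    """Pivot row tuples into per-column arrays (hypothetical helper, not dlt's code)."""
    # Shape (n_rows, n_fields) becomes (n_fields, n_rows) after the transpose.
    pivoted = np.asarray(rows, dtype="object").T
    # np.vsplit needs the first dimension to divide evenly into the
    # requested number of sections, otherwise it raises ValueError.
    parts = np.vsplit(pivoted, len(column_names))
    return {name: part.ravel() for name, part in zip(column_names, parts)}


rows = [(1, "a", 1.5), (2, "b", 2.5)]

# Three fields per row and three column names: the split succeeds.
ok = split_into_columns(rows, ["id", "name", "score"])

# Three fields per row but only two column names: the split fails.
try:
    split_into_columns(rows, ["id", "name"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If that assumption holds, the error would point to a mismatch between the number of columns in the reflected schema and the number of values in each fetched row, which is worth checking when narrowing down the cause.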

Expected behavior

No response

Steps to reproduce

import dlt
from sqlalchemy.dialects import registry

from sql_database import sql_table

registry.register('snowflake', 'snowflake.sqlalchemy', 'dialect')
pipeline = dlt.pipeline(
    pipeline_name="snowflake_pipeline",
    destination="duckdb",
    dataset_name="my_data",
)

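sql_alchemy_source = sql_table(
    table="PULL_REQUEST_EVENT",
    schema="DEMO_TEST_DLTHUB_REACTIONS",
    # Assumption: the original snippet did not set the backend explicitly
    # (it may have been set in config); this report concerns pyarrow.
    backend="pyarrow",
)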

info = pipeline.run([sql_alchemy_source])
print(info)

How are you using the source?

I'm considering using this source in my work, but this bug is preventing it.

Operating system

macOS

Runtime environment

Local

Python version

3.10

dlt destination

No response

Additional information

No response