dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0

Airflow is not able to run dlt pipeline with sql database as source #563

Open vvijayan-suki opened 2 months ago

vvijayan-suki commented 2 months ago

dlt version

0.5.3

Source name

sql_database

Describe the problem

I'm trying to run a pipeline with the verified sql_database source and a local filesystem destination.

The DAG run in Airflow fails with the following error:

[2024-08-24, 19:50:01 IST] {taskinstance.py:3301} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.12/site-packages/dlt/extract/pipe_iterator.py", line 274, in _get_source_item
    pipe_item = next(gen)
                ^^^^^^^^^
  File "/opt/airflow/dags/sql_database/helpers.py", line 211, in table_rows
    default_table_adapter(table, included_columns)
  File "/opt/airflow/dags/sql_database/schema_types.py", line 50, in default_table_adapter
    if isinstance(sql_t, sqltypes.Uuid):
                         ^^^^^^^^^^^^^
AttributeError: module 'sqlalchemy.sql.sqltypes' has no attribute 'Uuid'

From my initial triage: `Uuid` is a new type introduced in SQLAlchemy >= 2.0, while Airflow (the latest version being 2.10.0) still pins SQLAlchemy 1.4 (1.4.36 in my environment), where `sqltypes.Uuid` does not exist.
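Given that version mismatch, one possible workaround (a sketch only, not an upstream fix) is to guard the `Uuid` check so it degrades gracefully on SQLAlchemy 1.4. The helper name `is_uuid_type` below is hypothetical; the idea is to look the type up with `getattr` instead of referencing `sqltypes.Uuid` directly:

```python
# Hypothetical helper sketching a version-safe Uuid check.
# sqltypes.Uuid exists only in SQLAlchemy >= 2.0, so look it up
# dynamically instead of referencing the attribute directly.

def is_uuid_type(sql_t, sqltypes_module):
    uuid_type = getattr(sqltypes_module, "Uuid", None)  # None on SQLAlchemy 1.4
    return uuid_type is not None and isinstance(sql_t, uuid_type)
```

With a guard like this, `default_table_adapter` would simply skip the UUID branch on SQLAlchemy 1.4 instead of raising `AttributeError`.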

Expected behavior

The DAG should run successfully and not error out.

Steps to reproduce

  1. dlt init sql_database
  2. use sql_table to load data
  3. deploy the pipeline to airflow (docker)
  4. run the DAG

How are you using the source?

I'm considering using this source in my work, but this bug is preventing it.

Operating system

macOS

Runtime environment

Docker, Docker Compose

Python version

3.12

dlt destination

filesystem

Additional information

No response