dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

Add core sources dependencies in requirements.txt #1794

Closed rahuljo closed 3 days ago

rahuljo commented 2 months ago

dlt version

0.9.9a0

Describe the problem

The general flow for the SQL Database source in versions <0.5.4 has been:

  1. Install dlt: pip install dlt[duckdb]
  2. Create a dlt project: dlt init sql_database duckdb
  3. Install dependencies: pip install -r requirements.txt
  4. Run the pipeline: python sql_database_pipeline.py

But in version 0.9.9a0 this flow is broken: the generated requirements.txt does not include the dlt[sql_database] extra, so SQLAlchemy is never installed. The same steps now fail with the following error:

Traceback (most recent call last):
  File "/Users/rahul/Desktop/tutorial_sql_database/remove/env/lib/python3.11/site-packages/dlt/common/libs/sql_alchemy.py", line 5, in <module>
    from sqlalchemy import MetaData, Table, Column, create_engine
ModuleNotFoundError: No module named 'sqlalchemy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rahul/Desktop/tutorial_sql_database/remove/sql_database_pipeline.py", line 10, in <module>
    from dlt.sources.sql_database import sql_database, sql_table, Table
  File "/Users/rahul/Desktop/tutorial_sql_database/remove/env/lib/python3.11/site-packages/dlt/sources/sql_database/__init__.py", line 5, in <module>
    from dlt.common.libs.sql_alchemy import MetaData, Table, Engine
  File "/Users/rahul/Desktop/tutorial_sql_database/remove/env/lib/python3.11/site-packages/dlt/common/libs/sql_alchemy.py", line 12, in <module>
    raise MissingDependencyException(
dlt.common.exceptions.MissingDependencyException: 
You must install additional dependencies to run dlt sql_database helpers . If you use pip you may do the following:

pip install "dlt[sql_database]"

Install the sql_database helpers for loading from sql_database sources. Note that you may need to install additional SQLAlchemy dialects for your source database.
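
To confirm that the failure comes from the environment rather than from the pipeline script, a quick check like the one below can be run inside the same virtualenv. This is only a minimal sketch: the helper name missing_modules is hypothetical, and it assumes sqlalchemy is the only module in question, since that is the one the traceback reports.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "sqlalchemy" is the module the traceback above failed to import.
    missing = missing_modules(["sqlalchemy"])
    if missing:
        print(f'Missing modules: {missing}. Try: pip install "dlt[sql_database]"')
    else:
        print("sqlalchemy is importable")
```

If sqlalchemy is reported as missing right after pip install -r requirements.txt, the generated requirements file is the likely cause.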

Expected behavior

The generated requirements.txt should include the dlt[sql_database] dependency.

At present, this is the requirements.txt generated:

dlt[duckdb]>=0.9.9a0

Expected:

dlt[duckdb]>=0.9.9a0
dlt[sql_database]>=0.9.9a0
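
Until the generated file is fixed, the missing line could be added by hand or with a small script. The following is a sketch of one possible workaround, not how dlt init itself works: the helper ensure_requirement is hypothetical, and it assumes requirements.txt is a plain list with one requirement per line.

```python
from pathlib import Path


def ensure_requirement(path, requirement):
    """Append `requirement` to the file at `path` unless an identical line exists.

    Returns True if the file was changed, False otherwise.
    """
    req_file = Path(path)
    lines = req_file.read_text().splitlines() if req_file.exists() else []
    # Compare stripped lines so trailing whitespace does not cause a duplicate.
    if requirement in (line.strip() for line in lines):
        return False
    lines.append(requirement)
    req_file.write_text("\n".join(lines) + "\n")
    return True


if __name__ == "__main__":
    # Version pin copied from the expected requirements.txt above.
    changed = ensure_requirement("requirements.txt", "dlt[sql_database]>=0.9.9a0")
    print("requirements.txt updated" if changed else "requirement already present")
```

Running it between steps 2 and 3 of the flow would leave requirements.txt in the expected state, and running it again changes nothing.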

Steps to reproduce

  1. Install dlt: pip install dlt[duckdb]
  2. Create a dlt project: dlt init sql_database duckdb
  3. Install dependencies: pip install -r requirements.txt
  4. Run the pipeline: python sql_database_pipeline.py

Operating system

macOS

Runtime environment

Local

Python version

3.11

dlt data source

SQL Database

dlt destination

DuckDB

Other deployment details

No response

Additional information

No response