dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

Update sql_database source documentation to clarify resource usage #2069

Closed dat-a-man closed 1 week ago

dat-a-man commented 1 week ago

When using the sql_database source with .with_resources, schema reflection occurs for all tables in the schema, not just the specified ones. This behavior leads to excessive queries and significant performance degradation for schemas with many tables.

The docs do not currently make this behavior clear.

Workaround

To limit schema reflection to specific tables, use the table_names argument in sql_database instead of .with_resources:

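from dlt.sources.sql_database import sql_database

# connection_string is your database URL; only table_1 is reflected
source = sql_database(connection_string, table_names=["table_1"])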

This approach ensures that only the specified tables are reflected and avoids unnecessary overhead.
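To make the cost difference concrete, here is a minimal toy model using only the standard-library sqlite3 module. It is not dlt's implementation: the helpers make_db and reflect are hypothetical names invented for this sketch, and "reflection" is approximated by one PRAGMA table_info query per table. Pattern A mimics reflecting everything and selecting afterwards (analogous to .with_resources), while pattern B mimics restricting reflection up front (analogous to table_names).

```python
import sqlite3


def make_db(n_tables: int) -> sqlite3.Connection:
    """Create an in-memory SQLite database with tables table_1..table_n."""
    conn = sqlite3.connect(":memory:")
    for i in range(1, n_tables + 1):
        conn.execute(f"CREATE TABLE table_{i} (id INTEGER PRIMARY KEY, value TEXT)")
    return conn


def reflect(conn: sqlite3.Connection, table_names=None):
    """Return ({table: [column names]}, number of per-table reflection queries).

    If table_names is None, every table in the database is reflected.
    Hypothetical helper for illustration only; table names are trusted input.
    """
    if table_names is None:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        table_names = [row[0] for row in rows]

    schema = {}
    queries = 0
    for name in table_names:
        # One metadata query per table stands in for real schema reflection.
        columns = conn.execute(f"PRAGMA table_info({name})").fetchall()
        queries += 1
        schema[name] = [col[1] for col in columns]  # index 1 is the column name
    return schema, queries


conn = make_db(50)

# Pattern A (analogous to .with_resources): reflect everything, then select.
all_tables, queries_a = reflect(conn)
selected_a = {"table_1": all_tables["table_1"]}

# Pattern B (analogous to table_names=[...]): reflect only the requested table.
selected_b, queries_b = reflect(conn, table_names=["table_1"])

print(queries_a, queries_b)
```

Both patterns end up with the same table definition, but with 50 tables in the schema pattern A issues 50 per-table metadata queries and pattern B issues one. In a real database each of those queries is a network round trip, which is why the gap grows with the number of tables.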