dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

data load tool (dlt) — the open-source Python library for data loading

dlt can be dropped in anywhere: a Google Colab notebook, an AWS Lambda function, an Airflow DAG,
your local laptop, or a GPT-4-assisted development playground.

🚀 Join our thriving community of like-minded developers and build the future together!

Installation

dlt supports Python 3.8+.

pip install dlt

More options: Install via Conda or Pixi

Quick Start

Load chess player data from the Chess.com API and save it in DuckDB:

import dlt
from dlt.sources.helpers import requests

# Create a dlt pipeline that will load
# chess player data to the DuckDB destination
pipeline = dlt.pipeline(
    pipeline_name='chess_pipeline',
    destination='duckdb',
    dataset_name='player_data'
)

# Grab some player data from the Chess.com API
data = []
for player in ['magnuscarlsen', 'rpragchess']:
    response = requests.get(f'https://api.chess.com/pub/player/{player}')
    response.raise_for_status()
    data.append(response.json())

# Extract, normalize, and load the data
pipeline.run(data, table_name='player')
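The "normalize" step in the comment above can be hard to picture. As a rough illustration only, and not dlt's actual implementation, the sketch below flattens nested dictionaries into column-like keys joined with a double underscore, which resembles how nested fields tend to show up as columns in the destination. The function name `flatten_record` and the sample record are hypothetical and exist only for this example.

```python
from typing import Any, Dict


def flatten_record(
    record: Dict[str, Any], parent_key: str = "", sep: str = "__"
) -> Dict[str, Any]:
    """Flatten nested dicts into one level of column-like keys.

    Illustrative sketch only: this is NOT dlt's normalizer.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Prefix the key with its parent path, e.g. "stats" + "__" + "rating"
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries
            flat.update(flatten_record(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical record, not a real API response
sample = {
    "username": "example_player",
    "stats": {"rating": 1500, "games": {"won": 10}},
}
flat = flatten_record(sample)
print(flat)
```

The real library also handles cases this sketch ignores, such as nested lists and schema inference, so treat it as a mental model rather than a reference.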

Try it out in our Colab Demo

Features

Ready-to-use Sources and Destinations

Explore ready-to-use sources (e.g. Google Sheets) in the Verified Sources docs and supported destinations (e.g. DuckDB) in the Destinations docs.

Documentation

For detailed usage and configuration, please refer to the official documentation.

Examples

You can find examples for various use cases in the examples folder.

Adding as dependency

dlt follows semantic versioning with the MAJOR.MINOR.PATCH pattern.

We suggest that you allow only patch-level updates automatically.
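To make "patch-level updates only" concrete, here is a minimal sketch that checks whether a candidate version differs from a pinned one only in its PATCH component. The helper names and the version strings are hypothetical, and the parser assumes plain three-part numeric versions with no pre-release suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_patch_level_update(pinned: str, candidate: str) -> bool:
    """True if candidate keeps MAJOR.MINOR and has an equal or newer PATCH."""
    pinned_major, pinned_minor, pinned_patch = parse_version(pinned)
    cand_major, cand_minor, cand_patch = parse_version(candidate)
    same_series = (cand_major, cand_minor) == (pinned_major, pinned_minor)
    return same_series and cand_patch >= pinned_patch


pinned = "1.2.3"  # hypothetical pinned version, not a real dlt release
for candidate in ["1.2.4", "1.3.0", "2.0.0", "1.2.2"]:
    print(candidate, is_patch_level_update(pinned, candidate))
```

In a requirements file, pip's compatible release operator (`~=`) applied to a three-part version expresses the same constraint, so you would normally rely on that rather than a hand-written check.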

Get Involved

The dlt project is quickly growing, and we're excited to have you join our community! Here's how you can get involved:

License

dlt is released under the Apache 2.0 License.