dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

Document a real setup, with a staging and production instance #1753

Open boxydog opened 1 month ago

boxydog commented 1 month ago

Documentation description

I have read https://dlthub.com/docs/general-usage/credentials/setup/#credential-types and https://dlthub.com/docs/general-usage/naming-convention.

I still do not understand how to set up a staging and production instance, each of which has its own state.

I want to load incrementally, so I need a cron job that runs every 1-3 hours.

So I think I'm going to set the SOURCES__CREDENTIALS and DESTINATION__CREDENTIALS variables to postgres connection strings.
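To make the question concrete, here is a minimal sketch of what I have in mind for picking the credentials per environment. The `STAGING_*` / `PRODUCTION_*` variable names and both helper functions are hypothetical names of my own, not part of dlt; only `SOURCES__CREDENTIALS` and `DESTINATION__CREDENTIALS` are the variables mentioned above.

```python
import os

# Hypothetical environment labels; dlt itself has no notion of these.
ENVIRONMENTS = ("staging", "production")


def dlt_env_for(environment: str, env: dict) -> dict:
    """Map per-environment connection strings to the two variables named above.

    Assumes (hypothetically) that the cron box defines e.g.
    STAGING_SOURCE_DSN and STAGING_DESTINATION_DSN.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    prefix = environment.upper()
    return {
        "SOURCES__CREDENTIALS": env[f"{prefix}_SOURCE_DSN"],
        "DESTINATION__CREDENTIALS": env[f"{prefix}_DESTINATION_DSN"],
    }


def apply_environment(environment: str) -> None:
    """Export the selected credentials into this process before dlt is used."""
    os.environ.update(dlt_env_for(environment, dict(os.environ)))
```

The cron entry for each instance would then call `apply_environment("staging")` or `apply_environment("production")` at the top of the loading script, so the same code serves both instances.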

I think I also need some way to save and restore pipeline state, so the incremental state is correct even if the crontab box does not have permanent disk storage. Someone pointed me to an example gist: https://gist.github.com/rudolfix/ee6e16d8671f26ac4b9ffc915ad24b6e
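As far as I can tell, the simplest version of this would be something like the sketch below. It assumes that dlt keeps a copy of the pipeline state in the destination and that `pipeline.sync_destination()` restores it when the local working directory is empty; I have not confirmed that this is enough on its own. The pipeline and dataset names are hypothetical, chosen so that staging and production never share state.

```python
def pipeline_settings(environment: str) -> dict:
    """Hypothetical naming scheme: one pipeline and dataset per environment."""
    return {
        "pipeline_name": f"my_source_{environment}",
        "destination": "postgres",
        "dataset_name": f"my_data_{environment}",
    }


def run_incremental(environment: str, data):
    """Restore state from the destination (assumption), then load `data`."""
    import dlt  # imported here so pipeline_settings() works without dlt installed

    pipeline = dlt.pipeline(**pipeline_settings(environment))
    # Assumption: pulls the last stored state from the destination, so an
    # ephemeral cron box does not restart the incremental load from scratch.
    pipeline.sync_destination()
    return pipeline.run(data)
```

If this is the intended approach, the gist above may only be needed when the state has to live somewhere other than the destination.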

But all this is a lot to figure out, and I don't think the docs include a complete, working example of a staging and production setup.

Are you a dlt user?

Yes, I'm already a dlt user.