debussy-labs / debussy_concert

Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and pipelines.
Apache License 2.0

Possibility to do a partial reprocessing of a DAG #35

Open NiltonDuarte opened 1 year ago

NiltonDuarte commented 1 year ago

The problem: Currently, our examples depend heavily on dag_run.start_date as the metadata used for _ingestion_ts. This creates an issue: we can't do a partial reprocessing, because dag_run.start_date will be the start date of the reprocessing run, so the framework won't be able to find the parquet files in the correct directory, _logical_ts={{ execution_date }}/_ingestion_ts={{ dag_run.start_date }}.
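
A minimal sketch of the mismatch, in plain Python rather than Airflow templating. The `partition_path` helper and all three timestamps are hypothetical and only illustrate the directory layout quoted above; they are not part of the framework.

```python
from datetime import datetime, timezone


def partition_path(logical_ts: datetime, ingestion_ts: datetime) -> str:
    """Render a partition directory in the layout quoted in the issue."""
    return (
        f"_logical_ts={logical_ts.isoformat()}"
        f"/_ingestion_ts={ingestion_ts.isoformat()}"
    )


# Hypothetical timestamps, chosen only for illustration.
logical_ts = datetime(2023, 1, 1, tzinfo=timezone.utc)
first_start = datetime(2023, 1, 2, 0, 5, tzinfo=timezone.utc)    # original run
rerun_start = datetime(2023, 1, 10, 14, 30, tzinfo=timezone.utc)  # reprocessing run

# The ingestion task wrote its parquet files under the original start date.
written_path = partition_path(logical_ts, first_start)

# A downstream task that is re-run alone renders the path from the new start date.
lookup_path = partition_path(logical_ts, rerun_start)

print(written_path)
print(lookup_path)
print(written_path == lookup_path)
```

The logical timestamp is the same in both paths, but the `_ingestion_ts` component differs, so the re-run task looks in a directory that was never written.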

NiltonDuarte commented 1 year ago

Using XCom, as done in the bitcoin ingestion example in the PR https://github.com/DotzInc/debussy_concert/pull/34, solves this problem, but I do not like the solution.
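
For readers unfamiliar with the general idea, here is a rough sketch of passing the ingestion timestamp between tasks instead of re-reading dag_run.start_date. It is not the code from the PR: the `xcom_store` dict is a stand-in for Airflow's XCom storage, and the `ingest` and `load` functions, the key name, and the timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Stand-in for XCom storage, keyed by (run_id, key). Hypothetical, not Airflow's API.
xcom_store = {}


def ingest(run_id: str, dag_run_start_date: datetime) -> str:
    """Upstream task: store the ingestion timestamp once, when the data is written."""
    ingestion_ts = dag_run_start_date.isoformat()
    xcom_store[(run_id, "ingestion_ts")] = ingestion_ts
    return ingestion_ts


def load(run_id: str, logical_ts: datetime) -> str:
    """Downstream task: build the path from the stored value, not the run start date."""
    ingestion_ts = xcom_store[(run_id, "ingestion_ts")]
    return f"_logical_ts={logical_ts.isoformat()}/_ingestion_ts={ingestion_ts}"


# Hypothetical values, chosen only for illustration.
run_id = "scheduled__2023-01-01"
logical_ts = datetime(2023, 1, 1, tzinfo=timezone.utc)

# Original run: both tasks execute.
ingest(run_id, datetime(2023, 1, 2, 0, 5, tzinfo=timezone.utc))
first_path = load(run_id, logical_ts)

# Partial reprocessing: only the downstream task runs again. The stored
# timestamp is unchanged, so the path still points at the written files.
rerun_path = load(run_id, logical_ts)

print(first_path == rerun_path)
```

In this sketch the downstream path stays stable only as long as the upstream task is not re-run, since re-running it would overwrite the stored timestamp.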