-
Convert large batch file from Avro to Parquet
-
Currently, the ETL pipeline is a loosely federated set of scripts, AWS Glue/Athena SQL queries, and calculated fields in QuickSight. Investigate AWS options for running arbitrary Python code. Future m…
-
![DSND_Term2_ETL_Pipeline_Preparation_ipynb_at_updation_·_rajatsharma369007_DSND_Term2](https://user-images.githubusercontent.com/18123459/205132715-cee56c81-4f66-48c2-b2a0-713b2d157144.jpg)
I am w…
-
Many ETL processes have more complex workflows than a simple pipeline: conditional branches, split/merge branches, success/error branches, etc. So you probably need a node / graph …
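
To make the split/merge and success/error branches concrete, here is a minimal sketch of a workflow modelled as a dependency graph, using only the standard library (`graphlib`, Python 3.9+). The names `run_workflow`, `tasks`, and `deps` are hypothetical and chosen for illustration; this is not the API of any particular ETL tool, and a real orchestrator would add retries, logging, and parallel execution.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, on_error=None):
    """Run tasks in dependency order.

    tasks: task name -> callable taking a dict of upstream results
    deps:  task name -> set of upstream task names
    Returns (results, failed), where failed holds every task that
    raised or was skipped because something upstream failed.
    """
    results, failed = {}, set()
    for name in TopologicalSorter(deps).static_order():
        upstream = deps.get(name, set())
        if upstream & failed:
            # Error branch: an upstream task failed, so skip this one.
            failed.add(name)
            continue
        try:
            results[name] = tasks[name]({u: results[u] for u in upstream})
        except Exception as exc:
            failed.add(name)
            if on_error is not None:
                on_error(name, exc)
    return results, failed


# Hypothetical workflow: one extract step splits into two branches
# that are merged again at the end.
tasks = {
    "extract": lambda up: [1, 2, 3, 4],
    "evens": lambda up: [x for x in up["extract"] if x % 2 == 0],
    "odds": lambda up: [x for x in up["extract"] if x % 2 == 1],
    "merge": lambda up: sorted(up["evens"] + up["odds"]),
}
deps = {
    "extract": set(),
    "evens": {"extract"},
    "odds": {"extract"},
    "merge": {"evens", "odds"},
}

results, failed = run_workflow(tasks, deps)
```

Because the graph is explicit, a failure in one branch marks only its downstream tasks as failed, while independent branches still run.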
-
### Describe the bug
Hi,
I'm encountering an intermittent issue when using the `s3.read_parquet_table` function in my ETL pipeline. The pipeline reads Parquet files from S3 every 5 minutes (modin, r…
-
A [backfill](https://dagster.io/blog/backfills-in-ml) is a retroactive update to historical data, i.e. modifying the existing rows of a table or inserting missed rows after they're discovered to be m…
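
As a rough illustration of the two cases in that definition (correcting existing rows and inserting rows that were missed), the sketch below applies a backfill as an upsert against an in-memory SQLite table. The table `events`, its columns, and the function `backfill` are hypothetical; a warehouse table would more likely be backfilled partition by partition. The `ON CONFLICT ... DO UPDATE` clause assumes SQLite 3.24 or newer.

```python
import sqlite3


def backfill(conn, rows):
    """Upsert historical rows: update days that exist, insert days that were missed."""
    conn.executemany(
        "INSERT INTO events (day, total) VALUES (?, ?) "
        "ON CONFLICT(day) DO UPDATE SET total = excluded.total",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT PRIMARY KEY, total INTEGER)")

# Initial load: 2024-01-02 is missing and 2024-01-01 has a wrong total.
conn.executemany(
    "INSERT INTO events (day, total) VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-03", 7)],
)

# Backfill: correct one existing row and insert the missed one.
backfill(conn, [("2024-01-01", 12), ("2024-01-02", 5)])

rows = conn.execute("SELECT day, total FROM events ORDER BY day").fetchall()
```

Running the same backfill twice leaves the table unchanged, which makes it safe to retry.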
-
# Overview of EL, ELT or ETL:
https://coursera.org/share/98cbe98dfaef4d453c501ef5604855b9
## EL
EL is extract and load. This refers to when data can be imported as is into a system. Examples includ…
-
I work at DBT and have been improving an ETL pipeline for gov.uk content we have based on parameters the department needs. I'd like to configure it so it ingests and overwrites data that's changed rat…
-
### Apache Hop version?
2.5.0
### Java version?
17.0.7.7
### Operating system
Windows
### What happened?
I am trying to inject only two constant values from a parent pipeline into…