-
This may be outside the scope of the booklet, but there is a lack of information on how to pull data from an API, format the returned JSON into a dataframe, and then send the result off somewhere.
Fo…
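A minimal sketch of that flow, assuming a hypothetical API payload shape (a top-level `items` list) and sending the result off as CSV text; the endpoint and field names are placeholders:

```python
import json
import urllib.request

import pandas as pd


def fetch_json(url: str) -> dict:
    """Fetch a JSON document from an API endpoint (stdlib only)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def to_frame(payload: dict, record_path: str = "items") -> pd.DataFrame:
    """Flatten the list of records under `record_path` into a DataFrame."""
    return pd.json_normalize(payload.get(record_path, []))


# Sample payload standing in for a live API response:
sample = {"items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}
df = to_frame(sample)
csv_text = df.to_csv(index=False)  # "send it off": here, serialized as CSV
print(csv_text)
```

In practice the last step would be a POST, an S3 upload, or a database load instead of CSV serialization.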
-
1. Build scripts to automate tasks for data ingestion, cleansing, and transformation.
2. Create and implement an automated data pipeline to optimize data processing and analytical workflows.
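The two tasks above could start as a single script with one function per stage; this is only a sketch with assumed CSV input and an assumed `id` column:

```python
import csv
import io


def ingest(raw_csv: str) -> list:
    """Parse raw CSV text into a list of record dicts."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def cleanse(records: list) -> list:
    """Drop rows missing an id and strip stray whitespace from values."""
    return [{k: v.strip() for k, v in r.items()} for r in records if r.get("id")]


def transform(records: list) -> list:
    """Example transformation: cast id to int."""
    return [{**r, "id": int(r["id"])} for r in records]


def run_pipeline(raw_csv: str) -> list:
    """Chain the three stages: ingest -> cleanse -> transform."""
    return transform(cleanse(ingest(raw_csv)))


raw = "id,name\n1, alice \n,bob\n2,carol\n"
result = run_pipeline(raw)
print(result)
```

Scheduling (cron, Airflow, etc.) would then turn the chained call into the automated pipeline of step 2.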
-
- Decade reharvesting collections list: https://docs.google.com/spreadsheets/d/1h6Kt26wmJ67n3FWZLXWXmpuB2dtPBT6fvMUqCEDw9qw/edit?gid=0#gid=0
-
- Nuxeo harvested:
- Amy to investigate options to…
-
### Apache Hop version?
2.5.0
### Java version?
17.0.7.7
### Operating system
Windows
### What happened?
I am trying to inject only two constant values from a parent pipeline into…
-
### New feature motivation
As with the secrets checks for the other services (lambda/ec2/ecs/etc.), additional checks can be implemented.
### Solution Proposed
Elastic Beanstalk:
* Configuration files …
-
Currently, the ETL pipeline is a loosely federated set of scripts, AWS Glue/Athena SQL queries, and calculated fields in QuickSight. Investigate AWS options for running arbitrary Python code. Future m…
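One option worth investigating, AWS Lambda, runs arbitrary Python as a handler function. A hypothetical sketch (the event shape and field names are assumptions, not the current pipeline's contract):

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point: run one ETL step on the incoming event."""
    records = event.get("records", [])
    # Placeholder for logic currently spread across scripts and Glue/Athena SQL:
    total = sum(r.get("value", 0) for r in records)
    return {"statusCode": 200, "body": json.dumps({"total": total})}


# Local smoke test (no AWS needed):
result = handler({"records": [{"value": 2}, {"value": 3}]}, None)
print(result)
```

Comparable options include AWS Glue Python shell jobs and ECS/Fargate tasks, which trade cold-start limits for longer runtimes.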
-
## Context
Once we finish creating the ETL pipeline to download, filter, and parse papers from the various sources (see #562), we need to run this pipeline for the first time to e…
-
-
Utils to autogenerate input JSONs for each capsule; they will also manipulate (format) data for capsules.
Purpose: formatting and data manipulation for whatever each capsule needs.
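A minimal sketch of such a util, assuming a hypothetical per-capsule file naming scheme and JSON layout (the `capsule`/`inputs` fields and string-formatting rule are placeholders):

```python
import json
import tempfile
from pathlib import Path


def make_input_json(capsule: str, data: dict, out_dir: Path) -> Path:
    """Write a formatted input JSON for one capsule (field names are assumptions)."""
    formatted = {"capsule": capsule, "inputs": {k: str(v) for k, v in data.items()}}
    path = out_dir / f"{capsule}_input.json"
    path.write_text(json.dumps(formatted, indent=2))
    return path


# Generate one input file per capsule in a scratch directory:
out_dir = Path(tempfile.mkdtemp())
paths = [
    make_input_json(name, {"threshold": 0.5}, out_dir)
    for name in ("capsule_a", "capsule_b")
]
loaded = json.loads(paths[0].read_text())
print(loaded)
```

The real formatting rules would replace the naive `str()` cast with whatever each capsule expects.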
-
See [this execution](https://demo.etl.linkedpipes.com/#/pipelines/edit/canvas?pipeline=https://demo.etl.linkedpipes.com/resources/pipelines/1563878656717&execution=https://demo.etl.linkedpipes.com/res…