NASA-PDS / web-analytics


Design and Implement Analytics Pipeline in AWS Glue for Python Code Execution #34

Open kaipak opened 3 months ago

kaipak commented 3 months ago

Currently, the ETL pipeline is a loosely federated set of scripts, AWS Glue/Athena SQL queries, and calculated fields in QuickSight. Investigate AWS options for running arbitrary Python code. Future migrations might adopt Airflow or another pipeline framework, so the solution here may be temporary. It should be narrowly scoped and relatively low in effort and complexity.
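One way to keep the solution low effort and portable is to write each ETL step as a plain Python function and chain the steps with a small runner, so the same functions could later be wrapped by whichever AWS runtime or pipeline framework is chosen. The sketch below is illustrative only: the step names (`extract`, `transform`, `load`), the `run_pipeline` helper, and the sample rows are all hypothetical and not taken from the existing scripts.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical step type: takes the running context and returns the updated context.
Step = Callable[[Dict], Dict]


def extract(ctx: Dict) -> Dict:
    # Stand-in for reading raw web log rows; these rows are made-up sample data.
    ctx["rows"] = [
        {"path": "/a", "status": 200},
        {"path": "/b", "status": 404},
    ]
    return ctx


def transform(ctx: Dict) -> Dict:
    # Keep only successful requests (an assumed example of a cleaning rule).
    ctx["rows"] = [row for row in ctx["rows"] if row["status"] == 200]
    return ctx


def load(ctx: Dict) -> Dict:
    # Stand-in for writing results somewhere; here we only record a count.
    ctx["loaded"] = len(ctx["rows"])
    return ctx


def run_pipeline(steps: List[Step], ctx: Optional[Dict] = None) -> Dict:
    # Run each step in order, passing the context from one step to the next.
    ctx = dict(ctx or {})
    for step in steps:
        ctx = step(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline([extract, transform, load])
    print(result["loaded"])
```

Because the steps are ordinary functions with no framework imports, moving them into Airflow tasks or another orchestrator later should mostly be a matter of wrapping each call, though that depends on the framework eventually chosen.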

The solution should adhere to the following:

Sub-tasks

jordanpadams commented 2 months ago

@kaipak how did you create this ticket? I notice there are no labels or anything associated with it, so it appears to have not been created using a template.

jordanpadams commented 2 months ago

📆 03/2024 status: In work. On schedule.

kaipak commented 2 months ago

@jordanpadams I was about to start work on this ticket this week. I can recreate it if it's causing issues with CI/CD.

jordanpadams commented 2 months ago

@kaipak nope. you are good to go.

jordanpadams commented 1 month ago

📆 04/2024 status: In work. On schedule.

jordanpadams commented 1 month ago

📆 05/2024 status: Task paused due to lack of resources and having to roll the task team off the project. Hoping to restart towards the end of the build. Manual workaround available, so no impact on build.