Extend the Flask API to read from a .jsonl file of scraped Commands data and insert it into the Postgres DB.
Is your feature request related to a problem? Please describe.
As part of our data ETL pipeline, we need to take data that has been scraped and stored in an S3 bucket, transform it, and load it into our database. This feature focuses on loading "commands" (aka employment.unit) data.
Describe the solution you'd like
Read data from the attached .jsonl file
Transform the data so that it conforms to our employment data schema
Check whether each employment record duplicates data that was previously loaded; skip it if unchanged, and update the existing record if it has changed
Load the employment data into the database
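
To make the four steps above concrete, here is a minimal sketch of the read, transform, de-duplicate, and load flow. Everything in it is an assumption for discussion: the field names (`name`, `agency`, `city`), the `(agency, name)` key, and the function names are hypothetical, and a plain dict stands in for the Postgres table so the example runs without a database. The real schema and the real fixture may look different.

```python
import hashlib
import json
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank line of a .jsonl stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def transform(raw: dict) -> dict:
    """Map a scraped record onto a hypothetical employment.unit shape."""
    return {
        "name": raw["name"].strip(),
        "agency": raw.get("agency"),
        "city": raw.get("city"),
    }


def fingerprint(unit: dict) -> str:
    """Stable content hash, used to tell 'unchanged' from 'changed'."""
    payload = json.dumps(unit, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def load_units(lines: Iterable[str], store: dict) -> dict:
    """Insert new units, update changed ones, skip exact duplicates.

    `store` is a dict standing in for the database table (assumption).
    """
    counts = {"inserted": 0, "updated": 0, "skipped": 0}
    for raw in read_jsonl(lines):
        unit = transform(raw)
        key = (unit["agency"], unit["name"])  # hypothetical natural key
        digest = fingerprint(unit)
        existing = store.get(key)
        if existing is None:
            counts["inserted"] += 1
        elif existing["digest"] == digest:
            counts["skipped"] += 1
            continue
        else:
            counts["updated"] += 1
        store[key] = {"unit": unit, "digest": digest}
    return counts
```

In the actual endpoint, `lines` would come from the opened .jsonl file and `store` would be replaced by a lookup and upsert against the employment tables; returning the counts gives the API something useful to report back.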
Additional context
This task requires scraper output as its input. You can download a fixture of that data here.