moj-analytical-services / airflow-matrix-scraper

scraper for matrixbooking api

Extend booking types | Request from Laurence Droy #21

Closed andrewc-moj closed 6 months ago

andrewc-moj commented 1 year ago

Laurence requested to see all booking types from Matrix. Includes a test notebook to view the results from the API call.

andrewc-moj commented 11 months ago

Hey Thomas - I appreciate this isn't your area anymore, but I'd welcome some feedback on this PR as it stands. It's not quite ready for merging, as the main script currently tries to update the 'app database'.

Running the script

You can run the amended 'main' script with no trouble at all like this:

    python python_scripts/main.py -e dev --scrape_date yyyy-mm-dd

This will write the (modified) parquet files to a new path db/dev which a new database matrix_db_dev will pick up.
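For anyone reading along, here is a minimal sketch of how a command line like the one above could be parsed. Only `-e dev`, `--scrape_date` and the `db/dev` path come from this PR; the long option name `--env`, the `prod` choice and the helper names `parse_scrape_date`, `build_parser` and `output_root` are assumptions for illustration, not the actual contents of main.py.

```python
import argparse
from datetime import date, datetime
from pathlib import PurePosixPath


def parse_scrape_date(value: str) -> date:
    """Convert a yyyy-mm-dd string into a date, rejecting anything else."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected yyyy-mm-dd, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two options shown in the PR description."""
    parser = argparse.ArgumentParser(description="Matrix booking scraper (sketch)")
    # "--env" and the "prod" choice are assumed; the PR only shows "-e dev".
    parser.add_argument("-e", "--env", choices=["dev", "prod"], required=True)
    parser.add_argument("--scrape_date", type=parse_scrape_date, required=True)
    return parser


def output_root(env: str) -> PurePosixPath:
    """Return the folder the parquet files are written to, e.g. db/dev."""
    return PurePosixPath("db") / env


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-e", "dev", "--scrape_date", "2021-03-01"])
print(args.env, args.scrape_date, output_root(args.env))
```

Validating the date at parse time means a malformed `--scrape_date` fails before any API call is made.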

Defining the new database

The new database is defined in python_scripts/database_builder_v2.py.
My plan is to modify this script to take the environment (dev or prod) as an argument and set up the database accordingly.
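One possible shape for that change, as a rough sketch only: a lookup from environment name to database settings. The dev values (`matrix_db_dev`, `db/dev`) come from this PR; the prod values and the names `DatabaseConfig`, `CONFIGS` and `get_database_config` are hypothetical placeholders, not what database_builder_v2.py contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseConfig:
    name: str       # database name
    data_path: str  # folder holding the parquet files for this database


CONFIGS = {
    # Values taken from the PR description.
    "dev": DatabaseConfig(name="matrix_db_dev", data_path="db/dev"),
    # Hypothetical placeholder: the real prod name and path are not stated here.
    "prod": DatabaseConfig(name="matrix_db_prod", data_path="db/prod"),
}


def get_database_config(env: str) -> DatabaseConfig:
    """Look up the settings for an environment, failing loudly on typos."""
    try:
        return CONFIGS[env]
    except KeyError:
        raise ValueError(
            f"unknown environment {env!r}; expected one of {sorted(CONFIGS)}"
        ) from None


print(get_database_config("dev"))
```

Keeping the per-environment values in one table makes it harder for a dev run to write to the prod database by accident.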

Changes made

I've commented against all changes so you know why I'm doing them.

Specific areas for feedback

There are things I'd like your view on:

Any other feedback would be welcome, particularly on the conventions you'd expect people to follow in data engineering.

andrewc-moj commented 10 months ago

Matrix scraper - Respond to Thomas' PR comments

andrewc-moj commented 9 months ago

preprod DAG for running scraper - going back 12 months
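As an illustration of what "going back 12 months" could mean in practice, here is a small sketch that lists one scrape date per day for the 12 months up to a chosen end date, formatted the way `--scrape_date` expects. The daily granularity and the function names are assumptions; the DAG itself may schedule its runs differently.

```python
from datetime import date, timedelta
from typing import Iterator


def twelve_months_before(day: date) -> date:
    """Return the same calendar day one year earlier (29 Feb maps to 28 Feb)."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        # Only reached for 29 Feb, which does not exist in the previous year.
        return day.replace(year=day.year - 1, day=28)


def scrape_dates(end: date) -> Iterator[str]:
    """Yield yyyy-mm-dd strings from 12 months before `end` up to `end` inclusive."""
    current = twelve_months_before(end)
    while current <= end:
        yield current.isoformat()
        current += timedelta(days=1)


dates = list(scrape_dates(date(2021, 3, 1)))
print(dates[0], dates[-1], len(dates))
```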