maxachis closed this issue 2 years ago
Mark Z and Max will take a look at this.
Before I go any further with this script, I want to confirm that merged_dataset.csv isn't currently missing any source orgs or source files. If nothing is missing, my script will check that the dataset contains the following:
source_org
1 Grow Pittsburgh
2 USDA Food and Nutrition Service
3 PA WIC
4 Allegheny County
5 FMNP Markets
6 Greater Pittsburgh Community Food Bank
7 Just Harvest
source_file
1 GP_garden_directory_listing-20210322.csv
2 https://services1.arcgis.com/RLQu0rK7h4kbsBq5/arcgis/rest/services/Store_Locations/FeatureServer
3 wicresults.json
4 https://services1.arcgis.com/vdNDkVykv9vEWFX4/arcgis/rest/services/Child_Nutrition/FeatureServer
5 https://services5.arcgis.com/n3KaqXoFYDuIhfyz/ArcGIS/rest/services/FMNPMarkets/FeatureServer
6 https://services1.arcgis.com/vdNDkVykv9vEWFX4/arcgis/rest/services/COVID19_Food_Access_(PUBLIC)/FeatureServer
7 Just Harvest Google Sheets
I may also want to add sanity checking to ensure that each flag column contains both 0's and 1's. The values don't all have to be 0's and 1's -- I can see some scenarios where something being "NA" is fine, but we'd probably avoid problems by simply ensuring that no flag column is ALL NA's.
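For discussion, here is a rough Python sketch of the checks described above. It is not the script from the pull request. The expected values are copied from the lists in this thread; the function names, the `NA_VALUES` set, and the flag column name `SNAP` used in the usage note are assumptions for illustration only.

```python
import csv

# Expected values, copied from the source_org and source_file lists above.
EXPECTED_SOURCE_ORGS = {
    "Grow Pittsburgh",
    "USDA Food and Nutrition Service",
    "PA WIC",
    "Allegheny County",
    "FMNP Markets",
    "Greater Pittsburgh Community Food Bank",
    "Just Harvest",
}
EXPECTED_SOURCE_FILES = {
    "GP_garden_directory_listing-20210322.csv",
    "https://services1.arcgis.com/RLQu0rK7h4kbsBq5/arcgis/rest/services/Store_Locations/FeatureServer",
    "wicresults.json",
    "https://services1.arcgis.com/vdNDkVykv9vEWFX4/arcgis/rest/services/Child_Nutrition/FeatureServer",
    "https://services5.arcgis.com/n3KaqXoFYDuIhfyz/ArcGIS/rest/services/FMNPMarkets/FeatureServer",
    "https://services1.arcgis.com/vdNDkVykv9vEWFX4/arcgis/rest/services/COVID19_Food_Access_(PUBLIC)/FeatureServer",
    "Just Harvest Google Sheets",
}
# Assumption: missing values show up in the CSV as an empty string or "NA".
NA_VALUES = {"", "NA"}


def load_rows(path):
    """Read the CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def check_merged_dataset(rows, flag_columns):
    """Return a list of problem descriptions; an empty list means all checks passed."""
    problems = []
    expected_by_column = (
        ("source_org", EXPECTED_SOURCE_ORGS),
        ("source_file", EXPECTED_SOURCE_FILES),
    )
    for column, expected in expected_by_column:
        present = {(row.get(column) or "").strip() for row in rows}
        for value in sorted(expected - present):
            problems.append(f"missing {column}: {value}")
    for column in flag_columns:
        values = [(row.get(column) or "").strip() for row in rows]
        # Individual NA's are fine; only a column that is entirely NA is reported.
        if all(value in NA_VALUES for value in values):
            problems.append(f"flag column is all NA: {column}")
    return problems
```

Usage would be something like `check_merged_dataset(load_rows("merged_dataset.csv"), ["SNAP"])`, where `SNAP` stands in for whatever the real flag columns are called.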
Ellie gave the approval for all of the above, so Max, go ahead and put together this sanity checking script with the given parameters!
I have created a pull request for this!
https://github.com/CodeForPittsburgh/food-access-map-data/pull/206
In addition to the logic of the sanity checking script, I should note that the commands for adding and pushing to Git have been moved from the GitHub Actions yaml file for generate_merged_datasets into the run.sh shell script, so that the sanity checking script can properly control them.
At any rate, have a look at it and let me know if it looks good for merging!
Basically, check to make sure the merged_dataset.csv file:
And if those conditions aren't met, throw an error and don't add/commit.
Have this run within the rest of run.sh, after all the data prep scripts have run but before the final add/commit.
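To illustrate the ordering only (check first, then add/commit), here is a small Python sketch. In the repo the Git commands live in run.sh, so this is not how it is actually wired up. The function name, the commit message, and the file path passed to `git add` are all hypothetical, and `problems` is assumed to be a list of strings describing any failed conditions.

```python
import subprocess


def add_and_commit_if_valid(problems, run=subprocess.run):
    """Raise if any check failed; otherwise stage and commit the dataset."""
    if problems:
        # Throwing here means the add/commit below never happens.
        raise RuntimeError(
            "merged_dataset.csv failed sanity checks: " + "; ".join(problems)
        )
    # Hypothetical path and commit message, for illustration only.
    run(["git", "add", "merged_dataset.csv"], check=True)
    run(["git", "commit", "-m", "Update merged dataset"], check=True)
```

The `run` parameter exists only so the Git calls can be swapped out when trying this without touching a real repository.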