CodeForPittsburgh / food-access-map-data

Data for the food access map
MIT License

Add "source" field(s) to output dataset #23

drewlevitt closed 4 years ago

drewlevitt commented 4 years ago

I don't want to reinvent the existing schema (at /food-data/PFPC_data_files/fields_and_descriptions.xlsx) but I do think we should add at least one new field, which I propose to call "source", simply a text field describing the original source of that row.

It could simply be the original filename or API endpoint (e.g. "2019-10-10 PGH Food Bank Site Addresses.xlsx"), a more descriptive string (e.g. "Pittsburgh Food Bank Site Addresses, submitted by Justin on 2020-02-05"), or maybe just the origin organization (e.g. "Greater Pittsburgh Community Food Bank").

In fact, it might be beneficial to have both a "source_org" and a "source_file" field: source_file (which can change over time) will aid debugging, while source_org (which will not change over time) can be the basis for a prioritization list of data sources, so we can choose which data to preserve when removing duplicate sites.
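
To illustrate the prioritization idea, here is a rough sketch of how source_org might be used to pick a winner among duplicate sites. Everything beyond the two proposed field names is an assumption made for illustration: the `name` and `address` columns, the `SOURCE_PRIORITY` list and its ordering, and the helper names `source_rank` and `dedupe_sites` are all hypothetical, and matching duplicates on a normalized name and address is just one possible rule, not the project's actual method.

```python
# Hypothetical priority order: earlier entries win when duplicates collide.
# The second entry is a placeholder, not a real data source.
SOURCE_PRIORITY = [
    "Greater Pittsburgh Community Food Bank",
    "Example Secondary Source",
]


def source_rank(row):
    """Return the priority rank of a row's source_org (lower is better).

    Organizations missing from SOURCE_PRIORITY rank last.
    """
    org = row.get("source_org")
    if org in SOURCE_PRIORITY:
        return SOURCE_PRIORITY.index(org)
    return len(SOURCE_PRIORITY)


def dedupe_sites(rows, key_fields=("name", "address")):
    """Keep one row per site, preferring the highest-priority source_org.

    Rows are plain dicts. Two rows count as duplicates when their
    key_fields match after trimming whitespace and lowercasing
    (an assumed matching rule).
    """
    best = {}
    for row in rows:
        key = tuple(str(row.get(f, "")).strip().lower() for f in key_fields)
        if key not in best or source_rank(row) < source_rank(best[key]):
            best[key] = row
    return list(best.values())
```

Because source_file stays on every surviving row, it remains possible to trace a kept record back to the exact file it came from when debugging.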

drewlevitt commented 4 years ago

Added these fields to the revised schema.