Note: These are all manual or semi-manual tasks to fill in gaps in the existing info sheets. In some cases, this can be facilitated by the new facility crosswalk updating functions, but those are intended to be a process for adding new spellings/facilities, rather than updating the existing sheets.
To avoid version conflicts, make sure only one person is making updates at a time and that everyone is working off the latest version of the sheets!
Priority: Start with facilities that are in the latest scraped data (since these show up on the website). Eventually, we'll also want to do this with the historical data.
Some facilities in the scraped data already exist in fac_data and just need missing data populated
Other facilities do NOT already exist and will need to be added as new facilities or spellings (~350)
Facilities missing states?
[x] Assign states to federal facilities missing states where possible (~150)
Facilities missing jurisdiction?
[x] Assign jurisdictions to facilities missing jurisdictions where possible (~300)
Facilities missing coordinates?
Note: Start with sparse states to avoid overwhelming the map!
[ ] Assign lat/long to facilities missing coordinates where possible (~700). This can be done by:
Matching the facility to its corresponding HIFLD (leverage linkage volunteer work)
Adding the facility's address to programmatically geocode the facility
Manually adding the facility's lat/long
Adding this many new geocoded facilities will overwhelm the national website map - we'll need to work with Hyperobjekt to find a good solution!
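As a rough illustration of the "start with sparse states" idea, the sketch below builds a worklist of states that still have facilities missing coordinates, ordered so that states with the fewest facilities come first. It runs on toy rows rather than the real info sheet, and the column names (`Name`, `State`, `Latitude`, `Longitude`) and the convention that blanks or `NA` mean "missing" are assumptions for illustration only.

```python
from collections import Counter


def is_missing(value):
    """Treat None, empty strings, and the literal 'NA' as missing (assumed convention)."""
    return value is None or str(value).strip() in ("", "NA")


def missing_coords_by_state(rows):
    """Return (state, n_missing, n_total) tuples, states with the fewest facilities first."""
    total = Counter()
    missing = Counter()
    for row in rows:
        state = row.get("State")
        if is_missing(state):
            state = "UNKNOWN"  # facilities missing a state are grouped together
        total[state] += 1
        if is_missing(row.get("Latitude")) or is_missing(row.get("Longitude")):
            missing[state] += 1
    # Keep only states that still have something to fix
    states = [s for s in total if missing[s] > 0]
    ordered = sorted(states, key=lambda s: (total[s], s))
    return [(s, missing[s], total[s]) for s in ordered]


# Toy rows standing in for the info sheet (hypothetical columns and values)
fac_rows = [
    {"Name": "A", "State": "Wyoming", "Latitude": "", "Longitude": ""},
    {"Name": "B", "State": "Texas", "Latitude": "31.0", "Longitude": "-97.0"},
    {"Name": "C", "State": "Texas", "Latitude": "NA", "Longitude": "NA"},
    {"Name": "D", "State": "Texas", "Latitude": "29.7", "Longitude": "-95.4"},
]

worklist = missing_coords_by_state(fac_rows)
for state, n_missing, n_total in worklist:
    print(f"{state}: {n_missing} of {n_total} facilities missing coordinates")
```

Here "sparse" is taken to mean "few facilities overall"; ordering by the number of facilities already geocoded would be an equally reasonable reading.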
Spellings missing info sheet matches?
[ ] Delete and/or find matches for facilities in fac_spellings without matches in fac_data (~1200)
Some can be resolved by assigning the correct jurisdiction to the facility in the info sheet (i.e. replacing NA jurisdictions with federal/state/county so the merging process can find these matches)
Some will require updating the facility_name_clean to match the clean name in the info sheet
Some will require adding an altogether new facility to the info sheet
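The three cases above can be surfaced with a simple anti-join: list every row in fac_spellings whose key has no counterpart in fac_data. This is a minimal sketch on toy rows, not the project's actual merging process. `facility_name_clean` comes from the card; the other column names (`Name`, `State`, `Jurisdiction`, `xwalk_name_raw`) and the choice of join key are assumptions.

```python
def norm(value):
    """Normalize a key component so case and stray whitespace do not block a match."""
    return "" if value is None else str(value).strip().upper()


def unmatched_spellings(spellings, fac_data, keys=("State", "Jurisdiction")):
    """Return spellings whose (clean name, *keys) combination is absent from fac_data."""
    known = {
        (norm(f.get("Name")),) + tuple(norm(f.get(k)) for k in keys)
        for f in fac_data
    }
    return [
        s
        for s in spellings
        if (norm(s.get("facility_name_clean")),) + tuple(norm(s.get(k)) for k in keys)
        not in known
    ]


# Toy stand-ins for the two sheets (hypothetical rows)
fac_data = [
    {"Name": "ALPHA PRISON", "State": "Ohio", "Jurisdiction": "state"},
    {"Name": "BETA FCI", "State": "Ohio", "Jurisdiction": "NA"},  # jurisdiction not filled in
]
fac_spellings = [
    {"xwalk_name_raw": "Alpha Prison", "facility_name_clean": "ALPHA PRISON",
     "State": "Ohio", "Jurisdiction": "state"},
    {"xwalk_name_raw": "Beta FCI", "facility_name_clean": "BETA FCI",
     "State": "Ohio", "Jurisdiction": "federal"},
    {"xwalk_name_raw": "Gamma Jail", "facility_name_clean": "GAMMA JAIL",
     "State": "Ohio", "Jurisdiction": "county"},
]

unmatched = unmatched_spellings(fac_spellings, fac_data)
for row in unmatched:
    print(row["facility_name_clean"], row["Jurisdiction"])
```

In this toy data, BETA FCI fails to match only because its jurisdiction is NA in the info sheet, while GAMMA JAIL has no info sheet entry at all. Rerunning with `keys=("State",)` is one way to tell those two cases apart.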
Facilities missing other info?
[ ] Populate other fields (e.g. geographic info, facility info, pop/capacity source, etc.) where possible.
Closing this since a lot of it has been done already or we've changed how we want to handle missing fields (e.g. jurisdiction). I'll make new (and smaller) cards for the remaining outstanding pieces instead!