Relevant issue: Issue 5

In this pull request, I have:
- Added code to filter out the military mail locations 'AA', 'AE' and 'AP'
- Removed 'International non-PR' and 'Everywhere Else' from the list of URLs to be fetched
- Added code to classify services according to the ShelterApp categories (e.g. FOOD, HEALTH)
- Added a check for an exact zip code match and a fuzzy name match before inserting a service

To accomplish that last bullet, I've added functions that build a string of space-delimited n-grams. When that string is covered by a text index, a text search against it behaves like partial matching. If the top search result's name is a 90% match or higher with the proposed new service's name, the new service is treated as a duplicate and is not added to the database. I took a lot of inspiration for this approach from this article.
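For reviewers who want a feel for the n-gram and 90% threshold idea described above, here is a minimal, standalone sketch. It is not the code in this pull request: the function names (`make_ngrams`, `name_similarity`, `is_duplicate`), the `name` and `zip` field names, the minimum n-gram size of 3, and the use of `difflib.SequenceMatcher` to compute the match percentage are all assumptions made for illustration. The text-index lookup itself is left out, so `top_result` stands in for whatever the search returns first.

```python
from difflib import SequenceMatcher


def make_ngrams(text, min_size=3):
    """Build a space-delimited string of n-grams for each word in `text`.

    Every substring of length >= min_size is emitted per word; words shorter
    than min_size are kept whole. (min_size=3 is an assumed default.)
    """
    grams = []
    for word in text.lower().split():
        if len(word) < min_size:
            grams.append(word)
            continue
        for size in range(min_size, len(word) + 1):
            for start in range(len(word) - size + 1):
                grams.append(word[start:start + size])
    return " ".join(grams)


def name_similarity(a, b):
    """Return a 0-100 similarity score between two names (case-insensitive).

    SequenceMatcher is an assumed stand-in for the real scoring method.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100


def is_duplicate(new_service, top_result, threshold=90):
    """Decide whether `new_service` should be dropped as a duplicate.

    `top_result` is the best hit from the text search, or None if there was
    no hit. The 'zip' and 'name' keys are hypothetical field names.
    """
    if top_result is None:
        return False
    if new_service["zip"] != top_result["zip"]:
        return False  # the zip code must match exactly
    return name_similarity(new_service["name"], top_result["name"]) >= threshold


if __name__ == "__main__":
    existing = {"name": "St Mary's Food Pantry", "zip": "10001"}
    proposed = {"name": "St. Mary's Food Pantry", "zip": "10001"}
    print(make_ngrams("Food"))
    print(is_duplicate(proposed, existing))
```

Storing the output of `make_ngrams` in its own field and indexing that field is what lets a partial query such as "pantr" find "pantry". The threshold check then guards against near-identical names slipping in as separate records.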