ShelterApp / AddResources

http://shelterapp.org/

Various Minor Improvements #12

Closed hkuffel closed 3 years ago

hkuffel commented 3 years ago
  1. Added the ability to name multiple collections in the script: a check_collection to run fuzzy name matches against, a dump_collection to hold the new IRS services, and a dupe_collection to hold any duplicates found during the run.
  2. Added a step early on to delete any services whose EIN is already in the database.
  3. Added a step to scrape the date of the last update from the IRS website and compare it to the last update date in our DB.
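
For readers skimming the PR, the three steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names (`drop_known_eins`, `route_services`, `needs_refresh`), the `ein` and `name` keys, the 0.8 threshold, and the use of `difflib` for fuzzy matching are all assumptions, and plain lists stand in for the database collections.

```python
from datetime import date
from difflib import SequenceMatcher


def drop_known_eins(new_services, existing_eins):
    """Keep only services whose EIN is not already stored (step 2)."""
    return [s for s in new_services if s["ein"] not in existing_eins]


def is_fuzzy_match(a, b, threshold=0.8):
    """Case-insensitive similarity check; the threshold is an assumed value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def route_services(new_services, check_names, threshold=0.8):
    """Split services into (dump, dupes) by fuzzy name match (step 1).

    check_names stands in for the names held in check_collection;
    the two returned lists stand in for dump_collection and dupe_collection.
    """
    dump, dupes = [], []
    for service in new_services:
        if any(is_fuzzy_match(service["name"], n, threshold) for n in check_names):
            dupes.append(service)
        else:
            dump.append(service)
    return dump, dupes


def needs_refresh(irs_last_updated, db_last_updated):
    """True when the IRS data is newer than our DB, or the DB has no date (step 3)."""
    return db_last_updated is None or irs_last_updated > db_last_updated


if __name__ == "__main__":
    services = [
        {"ein": "111", "name": "HOPE SHELTER"},
        {"ein": "222", "name": "Food Bank West"},
        {"ein": "333", "name": "Already Stored"},
    ]
    if needs_refresh(date(2021, 1, 2), date(2021, 1, 1)):
        fresh = drop_known_eins(services, existing_eins={"333"})
        dump, dupes = route_services(fresh, check_names=["Hope Shelter"])
        print(len(dump), "to dump,", len(dupes), "flagged as duplicates")
```

Filtering by EIN before the fuzzy comparison keeps the more expensive name matching limited to services that are not already known.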

I haven't resolved the case where services have very different names but the same address, as handling potential variability in address strings actually seems like a pretty big job.
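
To illustrate why the address case is a bigger job, here is a naive normalizer, offered only as a sketch. The `normalize_address` helper and the abbreviation table are hypothetical and not part of this PR. It handles case, punctuation, and a few common abbreviations, but still misses equivalent forms such as `#4` versus `Suite 4`.

```python
import re

# Assumed, deliberately incomplete abbreviation table.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "boulevard": "blvd",
    "suite": "ste",
}


def normalize_address(address):
    """Lowercase, strip punctuation, and abbreviate a few common words."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    tokens = [ABBREVIATIONS.get(token, token) for token in cleaned.split()]
    return " ".join(tokens)


if __name__ == "__main__":
    # These two agree after normalization.
    print(normalize_address("123 Main Street, Suite 4"))
    print(normalize_address("123 main st ste 4"))
    # This one refers to the same place but normalizes differently.
    print(normalize_address("123 Main St. #4"))
```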