GUI / covid-vaccine-spotter

https://www.vaccinespotter.org
MIT License
508 stars 136 forks

Walmart store carries_vaccine == false has appointments #157

Closed jxchong closed 3 years ago

jxchong commented 3 years ago

FYI @GUI Walmart 2196 is listed as carries_vaccine == false, but they have a bunch of open appointments this week. Not sure if the store refresh is failing again (#118) or if Walmart simply isn't reliably updating that flag?

GUI commented 3 years ago

@jxchong: Thanks so much for reporting this. This turned out to be a somewhat subtle bug that affected a handful of locations, but it should be fixed now. I've also added some additional timestamp logging that should make this type of issue more obvious. I've verified that all locations in the US are now being updated (and found a few locations that were still in the database even though the stores have since shut down completely). Sorry for the trouble, but thanks as always for bringing things like this up!

Two issues were preventing about 40 locations (out of 4600) from having their store metadata refreshed on a regular basis:

  1. We search for stores using grid cells with a ~50 mile radius, to match the 50 mile radius of the Walmart search API. However, Walmart's search only accepts an address or zip code, so we search by the grid cell's centroid zip code. That zip code's location may not be exactly at the center of the grid cell, so a 50 mile search radius may not actually cover the full grid cell.

    This was affecting ~20 isolated stores in a few random locations throughout the US.

    This is fixed by increasing the search radius to 100 miles while still basing the searches on a 50 mile grid cell, so that there's plenty of overlap to cover off-center zip codes.

  2. Walmart's search API returns a maximum of 50 stores. If there were more than 50 stores in the specified search radius, we weren't finding the rest.

    This was affecting ~20 stores in the Dallas metro region.

    This is fixed by re-processing grid cells using a smaller grid if we encounter 50 results.
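
For anyone curious how the two fixes fit together, here is a minimal Python sketch (not the project's actual code). It assumes a flat plane measured in miles, a hypothetical `fake_search` function standing in for Walmart's search API with its 50-result cap, and a hypothetical `Cell` type. Each cell is searched with a radius twice the cell's own radius (100 miles for a 50 mile cell), and any cell whose search hits the cap is split into four smaller cells and searched again.

```python
import math
from dataclasses import dataclass

MAX_RESULTS = 50     # result cap described in the thread
RADIUS_FACTOR = 2.0  # search radius is 2x the cell radius (100 mi for a 50 mi cell)


@dataclass(frozen=True)
class Cell:
    x: float       # cell center, in miles (hypothetical flat coordinates)
    y: float
    radius: float  # half the side length of the square cell, in miles


def fake_search(stores, x, y, radius, limit=MAX_RESULTS):
    """Stand-in for the search API: nearest stores within `radius`, capped at `limit`."""
    hits = sorted(
        (math.hypot(sx - x, sy - y), store_id)
        for store_id, (sx, sy) in stores.items()
        if math.hypot(sx - x, sy - y) <= radius
    )
    return [store_id for _, store_id in hits[:limit]]


def find_stores(stores, cell, min_radius=1.0):
    """Collect store IDs for a cell, subdividing whenever the result cap is hit."""
    results = fake_search(stores, cell.x, cell.y, cell.radius * RADIUS_FACTOR)
    found = set(results)
    # A full page of results may mean some stores were cut off, so retry on a finer grid.
    if len(results) >= MAX_RESULTS and cell.radius > min_radius:
        half = cell.radius / 2
        for dx in (-half, half):
            for dy in (-half, half):
                child = Cell(cell.x + dx, cell.y + dy, half)
                found |= find_stores(stores, child, min_radius)
    return found


# 120 made-up stores packed into a single 50 mile cell.
points = [(x, y) for x in range(-44, 45, 8) for y in range(-45, 46, 10)]
stores = {f"store-{i}": p for i, p in enumerate(points)}
cell = Cell(x=0.0, y=0.0, radius=50.0)

single = fake_search(stores, cell.x, cell.y, 100.0)
everything = find_stores(stores, cell)
print(len(single), len(everything))  # prints: 50 120
```

The doubled radius means a search whose center is somewhat off from the true cell center still covers the whole cell, and the subdivision only kicks in for dense areas, so sparse regions still need just one search per cell.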

To make these situations easier to monitor and detect, I've also added a new timestamp field that tracks when this store-level metadata is refreshed on each location, so it's easier to see whenever this is not happening for specific stores.
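
As a rough illustration of how such a timestamp could be used for monitoring, the Python sketch below flags locations whose metadata has not been refreshed recently. The field names `store_id` and `metadata_refreshed_at`, and the two-day threshold, are assumptions made up for this example and are not taken from the project.

```python
from datetime import datetime, timedelta, timezone


def stale_locations(locations, now, max_age=timedelta(days=2)):
    """Return IDs of locations whose metadata refresh is missing or older than max_age."""
    stale = []
    for loc in locations:
        refreshed = loc.get("metadata_refreshed_at")  # hypothetical field name
        if refreshed is None or now - refreshed > max_age:
            stale.append(loc["store_id"])
    return stale


now = datetime(2021, 4, 1, tzinfo=timezone.utc)
locations = [
    {"store_id": "1001", "metadata_refreshed_at": now - timedelta(hours=3)},
    {"store_id": "2196", "metadata_refreshed_at": now - timedelta(days=9)},
    {"store_id": "9999", "metadata_refreshed_at": None},  # never refreshed
]
print(stale_locations(locations, now))  # prints: ['2196', '9999']
```

Treating a missing timestamp as stale means a store that silently drops out of every search still shows up in the report.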