livgust / covid-vaccine-scrapers

Open-source project using Node.js and Puppeteer to scrape websites for COVID vaccine availability in Massachusetts. Can be modified to suit other areas and needs.
MIT License

CVS is overoptimistic #116

Closed by johnhawkinson 3 years ago

johnhawkinson commented 3 years ago

As we kind of knew, the CVS front-end JSON availability is overoptimistic, and sometimes stale.

I saw that this morning, with, e.g., Cambridge availability reported but not actually present.

This suggests it might be worth the effort to probe more deeply into attempting to actually book an appointment to get site-specific, time-specific, and numeric availability.

I am Not Excited about working on this and anyone should Feel Free To Pick It Up. (I am also not so sure how necessary it really is.)

jhalexand commented 3 years ago

I can't be certain, but I have at least anecdotal evidence that this morning (4am) CVS may have listed as available every site that was going to release appointments today. I was running my own little crawler, just for CVS, that sent push notifications to my phone; it uses the same .json file as the scraper here. While there are definitely issues with timeliness, around 6:15am the CVS stores in Malden did open up both locations with full availability. It looked like Chelsea and, I think, East Boston opened at the same time. Those filled up quickly, and Lynn opened shortly thereafter. At about 6:35am the JSON was updated, and of the 31 sites initially listed with availability, only 10 were left. Shortly after that I stopped my crawler, since we had gotten an appointment for my wife (a high school teacher), so I'm not sure how quickly the others fell off the list.

All that is to say that they may well have had appointments at all those locations and the release may have been staggered starting around 6. I'm not sure it gets around having to check for actual availability, but anecdotally it shows that perhaps the info was correct and just premature. It didn't take them too long (given the every 15 minutes update interval) to trim down the list of available sites after they got booked up.
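To make the "31 sites listed, then only 10 left" observation concrete, here is a minimal sketch of comparing two consecutive polls of the availability file and reporting which sites opened or dropped off. The snapshot shape (`city`, `available`) and the sample values are assumptions for illustration only; the real layout of the CVS JSON is not described in this thread.

```javascript
// Hypothetical snapshot shape: an array of { city, available } records.
// This is an assumed layout, not the actual CVS JSON schema.
function availableCities(snapshot) {
  return new Set(snapshot.filter((s) => s.available).map((s) => s.city));
}

// Compare two polls and list the cities whose status flipped.
function diffSnapshots(previous, current) {
  const before = availableCities(previous);
  const after = availableCities(current);
  return {
    opened: [...after].filter((c) => !before.has(c)).sort(),
    closed: [...before].filter((c) => !after.has(c)).sort(),
  };
}

// Made-up sample data, loosely modeled on the cities mentioned above.
const earlier = [
  { city: "MALDEN", available: true },
  { city: "CHELSEA", available: true },
  { city: "LYNN", available: false },
];
const later = [
  { city: "MALDEN", available: false },
  { city: "CHELSEA", available: false },
  { city: "LYNN", available: true },
];

console.log(diffSnapshots(earlier, later));
```

Logging the `closed` list with a timestamp on each poll would show how quickly sites fall off after being listed, which is one way to tell "premature" listings apart from stale ones.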

You do have to get redirected to the actual questionnaire from the waiting area then fill out several pages of questions before you see availability, and the search is based on zipcode or city/state, so it would require iterating over each of the locations in the .json file and submitting the form for each. Doesn't sound fun.

johnhawkinson commented 3 years ago

> You do have to get redirected to the actual questionnaire from the waiting area then fill out several pages of questions before you see availability, and the search is based on zipcode or city/state, so it would require iterating over each of the locations in the .json file and submitting the form for each. Doesn't sound fun.

For what it's worth, that's what most of the other scrapers do. When you get down to it, it's just a for (;;) loop.
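As a rough sketch of that loop, assuming the per-location work (getting through the waiting area, filling out the questionnaire, reading the result) lives in a separate function: `probeSite` below is a hypothetical callback, not something that exists in this repo, and the stub used to exercise it returns made-up numbers.

```javascript
// Visit each location in turn and record what the probe reports.
// probeSite is a hypothetical async callback that would drive the booking
// questionnaire for one location and resolve to a slot count.
async function probeAll(locations, probeSite, delayMs = 0) {
  const results = [];
  for (const location of locations) {
    try {
      const slots = await probeSite(location);
      results.push({ location, slots, error: null });
    } catch (err) {
      // One failing site should not abort the whole pass.
      const message = String(err && err.message ? err.message : err);
      results.push({ location, slots: null, error: message });
    }
    if (delayMs > 0) {
      // Optional pause between submissions.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}

// Stub probe with made-up slot counts, standing in for the real form flow.
const fakeProbe = async (location) => (location === "CAMBRIDGE" ? 0 : 4);

probeAll(["CAMBRIDGE", "MALDEN"], fakeProbe).then((results) => {
  console.log(results);
});
```

Running the locations sequentially rather than in parallel keeps the request rate low, at the cost of a slower pass over the full list.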

Anyhow, I remain unconvinced that this problem necessarily needs solving (it's "close enough"), and it sounds like you're in roughly the same place. Love to hear stronger opinions on either side.

livgust commented 3 years ago

Thanks for chiming in @jhalexand! I agree, the JSON file is not ideal. We'll have to see what straw breaks the camel's back here, I think...

iann commented 3 years ago

Has anyone explored trying to get a CVS API key (https://developer.cvshealth.com)? I just put my name in and it's awaiting approval. This tile on their developer page looks promising.

[Image: CVS developer page]