livgust / covid-vaccine-scrapers

Open-source project using Node.js and Puppeteer to scrape websites for COVID vaccine availability in Massachusetts. Can be modified to suit other areas and needs.
MIT License

Trinity EMS scheduling is now private; scraper needs updating #177

Open pfurbacher opened 3 years ago

pfurbacher commented 3 years ago

I'd appreciate it if those who read this issue would try the Trinity EMS URL (https://app.acuityscheduling.com/schedule.php?owner=21713854&calendarID=5109380) to corroborate what I'm seeing (image included below).

It appears that Trinity EMS has changed its approach to scheduling: all appointment types are now marked as private and unavailable for scheduling. With this change, the selectors the scraper relies on (e.g., to wait for the page to load and to get the month count) have disappeared. As a consequence, the scraper times out waiting for the month chooser's selector. At the very least, this problem should be fixed.
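
One way to avoid the timeout is to treat a missing month chooser as a possible "private" page rather than as a failure. Below is a minimal sketch, assuming a Puppeteer `page` object. The selector `#month-chooser`, the function name `detectSchedulingState`, and the matched phrase are placeholders for illustration, not the scraper's actual values.

```javascript
// Hypothetical helper: the selector and phrase below are assumptions,
// not values taken from the real Trinity EMS scraper.
const MONTH_SELECTOR = "#month-chooser"; // placeholder selector
const PRIVATE_PHRASE = "appointment types are private"; // assumed wording

async function detectSchedulingState(page, timeoutMs = 5000) {
  try {
    // waitForSelector rejects with a TimeoutError if the element never appears.
    await page.waitForSelector(MONTH_SELECTOR, { timeout: timeoutMs });
    return { state: "calendar" };
  } catch (err) {
    // Only swallow timeouts; rethrow anything else (navigation errors, etc.).
    if (err.name !== "TimeoutError") throw err;
    const html = await page.content();
    const isPrivate = html.toLowerCase().includes(PRIVATE_PHRASE);
    return { state: isPrivate ? "private" : "unknown" };
  }
}
```

With this shape, a "private" state can be reported as a successful scrape with no availability, while "unknown" still signals that the page layout changed in some other way.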

Beyond this, it might be worthwhile rethinking whether this site needs to be scraped. After all, it is several months into the vaccination drive, and the site has never given any indication that there would be availability.

[Image: TrinityEMS-private-appointments-only]

harcod commented 3 years ago

Yes, I'm seeing the same "private" message above.

pfurbacher commented 3 years ago

Thanks for the confirmation.

pfurbacher commented 3 years ago

As I rework this scraper, I'm wondering whether it would be valuable to note in the result object returned from the scraper that appointments are private, for example as a restriction, as in the following. (In the reworked code, I'm already scraping this message text; I'm just wondering whether I should insert it into the results object.)

{
    "version": 1,
    "debug": {
        "clientIpAddress": "108.26.228.132"
    },
    "timestamp": "2021-03-24T014058Z",
    "results": [
        {
            "name": "Trinity EMS (DiBurros Function Facility)",
            "street": "887 Boston Rd",
            "city": "Haverhill",
            "zip": "01835",
            "website": "https://app.acuityscheduling.com/schedule.php?owner=21713854&calendarID=5109380",
            "restrictions": "All appointment types are private, none are available for scheduling.",
            "latitude": 42.753818,
            "longitude": -71.104598
        }
    ]
}
livgust commented 3 years ago

Hey Paul! 3 things. 1) We should always return hasAvailability: true/false if we have a successful scrape. 2) 'restrictions' is meant for locations that only allow certain residents, so that's not the place to put that information. Perhaps you could put it in extraData? 3) I think we should still scrape this site because https://trinityems.com/what-we-do/covid-19-vaccine-clinics/ insists that they will only be adding appointments through their website, which still links to the acuityscheduling page. So perhaps this is temporary.

Thoughts?
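
Points 1 and 2 could be sketched roughly as follows. This is only an illustration: `buildResult` is a hypothetical helper, and it assumes `availability` maps dates to entries that each carry their own `hasAvailability` flag, which this thread does not confirm.

```javascript
// Hypothetical helper; the per-date shape of `availability` is an assumption.
function buildResult(site, availability = {}, extraData) {
  // A successful scrape always reports hasAvailability, even when it is false.
  const hasAvailability = Object.values(availability).some(
    (entry) => Boolean(entry && entry.hasAvailability)
  );
  const result = { ...site, availability, hasAvailability };
  // Site messages go in extraData; `restrictions` stays reserved for
  // residency-type limits and is never set here.
  if (extraData) result.extraData = extraData;
  return result;
}

const example = buildResult(
  { name: "Trinity EMS (DiBurros Function Facility)", city: "Haverhill" },
  {},
  "All appointment types are private, none are available for scheduling."
);
// example.hasAvailability is false, and example has no `restrictions` key.
```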

pfurbacher commented 3 years ago
  1. Not sure what happened to the availability and hasAvailability properties (sloppy finger taps on the trackpad while editing the out.json copy? I sometimes end up deleting/overwriting large swaths because of this!). See current (actual, not edited) out.json pasted below.
  2. extraData is it.
  3. If we keep it, the scraper needs to notify us (via S3 and Slack) should they change the private-appointments-only stance. I'll add that capability.
{
    "version": 1,
    "debug": {
        "clientIpAddress": "108.26.228.132"
    },
    "timestamp": "2021-03-24T142518Z",
    "results": [
        {
            "name": "Trinity EMS (DiBurros Function Facility)",
            "street": "887 Boston Rd",
            "city": "Haverhill",
            "zip": "01835",
            "website": "https://app.acuityscheduling.com/schedule.php?owner=21713854&calendarID=5109380",
            "availability": {},
            "hasAvailability": false,
            "extraData": "All appointment types are private, none are available for scheduling.",
            "latitude": 42.753818,
            "longitude": -71.104598
        }
    ]
}
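
The change detection mentioned in point 3 might look something like the sketch below. It is a guess at the comparison step only, with a hypothetical function name; how the previous result is loaded from S3 and how the Slack message is sent are left out because this thread does not describe them.

```javascript
// Hypothetical check: compares the previous scrape result for this site
// with the current one and reports whether a notification is warranted.
function privateStanceChanged(previous, current) {
  if (!previous) return false; // first run: nothing to compare against
  return (
    previous.extraData !== current.extraData ||
    previous.hasAvailability !== current.hasAvailability
  );
}

const before = {
  hasAvailability: false,
  extraData: "All appointment types are private, none are available for scheduling.",
};
const after = { hasAvailability: false }; // the private message has disappeared

const shouldNotify = privateStanceChanged(before, after);
// shouldNotify is true because extraData differs between the two scrapes.
```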
livgust commented 3 years ago

#178 closes this.