usdigitalresponse / univaf

An API hosted by USDR for recording and querying vaccine appointment availability.
https://getmyvax.org/
Apache License 2.0

Deal with Albertsons switching to different booking API #1460

Closed Mr0grog closed 1 year ago

Mr0grog commented 1 year ago

(Noticed this while looking into #1459.)

Sometime in the last 7 months (the last time I dealt with Albertsons’s systems was in August), Albertsons moved all their COVID vaccine booking to their normal pharmacy scheduling system instead of the MHealth system they were originally using and that our system used. Store data is still in MHealth, but it’s now stale.

The new booking site is at https://www.albertsons.com/vaccinations/home. Data is loaded by POSTing JSON to https://rxie.albertsons.com/abs/prod/rxie/appointment/service-availability. It searches based on zip code, so it will work more like our other scrapers, and will be much more intensive to run. OTOH, the response data is formatted much better, and does not have all the nutty parsing issues we dealt with from MHealth.

We should probably disable the old MHealth source.

A new scraper could be written, though we may consider only running it for NJ and AK. (Alternatively, we can just stop pulling data from Albertsons, and rely on the not-actually-appointments stock data from CDC. That would be a bit disappointing, though.)

Mr0grog commented 1 year ago

Example POST payload:

{
    "radius": 15,
    "dateOfBirth": "04/20/1980",
    "serviceTypes": [
        {
            "serviceTypeName": "COVID-19 Pfizer",
            "scientificName": ""
        }
    ],
    "zip": "60190"
}

Whole request as cURL:

curl 'https://rxie.albertsons.com/abs/prod/rxie/appointment/service-availability' \
-X POST \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/113.0' \
-H 'Accept: application/json, text/plain, */*' \
-H 'Accept-Language: en-US,en;q=0.5' \
-H 'Accept-Encoding: gzip, deflate, br' \
-H 'correlationId: 7e0e336a-3dc8-45fe-a6c2-280cab92f349' \
-H 'version: 2' \
-H 'ocp-apim-subscription-key: b7bda12ac31a47b0a2bb3cad633d60c4' \
-H 'Content-Type: application/json' \
-H 'Origin: https://www.albertsons.com' \
-H 'DNT: 1' \
-H 'Connection: keep-alive' \
-H 'Referer: https://www.albertsons.com/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: same-site' \
-H 'Pragma: no-cache' \
-H 'Cache-Control: no-cache' \
-H 'TE: trailers' \
--data-raw '{"radius":15,"dateOfBirth":"04/20/1980","serviceTypes":[{"serviceTypeName":"COVID-19 Pfizer","scientificName":""}],"zip":"60190"}'
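
For reference, here is a rough Python sketch of building the same request. The function name and its parameters are made up for illustration. It assumes the reduced header set (no browser-only headers) is enough for this endpoint, and that a freshly generated UUID is acceptable as the correlationId; neither has been confirmed.

```python
import json
import urllib.request
import uuid

AVAILABILITY_URL = (
    "https://rxie.albertsons.com/abs/prod/rxie/appointment/service-availability"
)


def build_availability_request(
    zip_code, service_type_name, subscription_key, radius=15,
    date_of_birth="04/20/1980",
):
    """Build (but do not send) a POST request for one zip code and one service type."""
    payload = {
        "radius": radius,
        "dateOfBirth": date_of_birth,
        "serviceTypes": [
            {"serviceTypeName": service_type_name, "scientificName": ""}
        ],
        "zip": zip_code,
    }
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json",
        # Assumption: any UUID works here. It is unclear whether this is
        # really a session token or something more static.
        "correlationId": str(uuid.uuid4()),
        "version": "2",
        "ocp-apim-subscription-key": subscription_key,
    }
    return urllib.request.Request(
        AVAILABILITY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen()` would actually send it. Keeping construction separate from sending makes the payload easy to check without hitting the API.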

Responses are like:

[
    {
        "store": {
            "storeNumber": "3098",
            "currentStoreNumber": "3098",
            "storeName": "JEWEL-OSCO",
            "storeDivision": "3b329a37-459b-eb11-b1ac-0022480a8bdf",
            "streetAddress": "12003 S PULASKI ROAD",
            "streetAddress2": null,
            "city": "Alsip",
            "state": "IL",
            "zip": "60803",
            "longitude": -87.71785,
            "latitude": 41.6741,
            "distanceFromSearchPointMiles": 26.287,
            "telephone": "(708) 389-2725",
            "fax": "(708) 389-2786",
            "imageUrl": "https://images.albertsons-media.com/is/image/ABS/jewelosco_logo",
            "mfHours": "8AM - 8PM",
            "satHours": "9AM - 5PM",
            "sunHours": "9AM - 5PM",
            "twentyfourHours": "false",
            "availableAppointments": null,
            "websiteUrl": "https://www.jewelosco.com/pharmacy.html",
            "vcClientId": null
        },
        "serviceTypes": [
            {
                "serviceTypeName": "COVID-19 Additional Dose Pfizer Child"
            }
        ]
    },
    // ...etc...
]
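
A minimal sketch of flattening a response like the one above into simpler location records. The output field names (`store_number`, `position`, and so on) are hypothetical and not taken from our existing loaders; only the input keys come from the sample response.

```python
def parse_availability(entries):
    """Convert raw service-availability entries into flat location dicts."""
    locations = []
    for entry in entries:
        store = entry["store"]
        locations.append({
            "store_number": store["storeNumber"],
            "name": store["storeName"],
            "address": store["streetAddress"],
            "city": store["city"],
            "state": store["state"],
            "zip": store["zip"],
            "position": {
                "longitude": store["longitude"],
                "latitude": store["latitude"],
            },
            "booking_phone": store.get("telephone"),
            # serviceTypes may be missing or null, so fall back to an empty list.
            "service_types": [
                service["serviceTypeName"]
                for service in (entry.get("serviceTypes") or [])
            ],
        })
    return locations
```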

Need to…

Mr0grog commented 1 year ago

Looks like you can get a list of services from a POST to https://rxie.albertsons.com/abs/prod/rxie/appointment/services. It returns data like:

{
    "services": [
        {
            "type": "Vaccine",
            "name": "COVID-19 Booster Dose Janssen",
            "label": "COVID-19",
            "scientificname": "",
            "epsDisplayName": "COVID-19 Booster Dose Janssen",
            "eligible": true,
            "description": "Minimum required age for vaccine is 12 years",
            "restrictionText": "",
            "additionalText": "Ages 3+",
            "sortIndex": 1,
            "category": "COVID-19"
        },
        // ...etc....
    ],
    "correlationId": "abc123"
}

The name field appears to match the serviceTypeName in other requests, and it looks like we can filter COVID vaccines based on label == "COVID-19" (COVID non-vaccine stuff seems to have other labels).
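
As a sketch, that filter could look like the following in Python. The function name is made up, and it relies on the observation above that `label == "COVID-19"` only matches vaccines, which may not hold forever.

```python
def covid_service_names(services_response):
    """Return the `name` of every service labeled "COVID-19".

    These names are what would be sent as `serviceTypeName` in
    service-availability requests.
    """
    return [
        service["name"]
        for service in services_response.get("services", [])
        if service.get("label") == "COVID-19"
    ]
```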

Here’s a full output I’m getting right now: https://gist.github.com/Mr0grog/f1cc7f89e3b27dc1ad567b0f23c14692

cURL example:

curl 'https://rxie.albertsons.com/abs/prod/rxie/appointment/services' \
  -X POST \
  -H 'Accept: application/json' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  -H 'correlationId: 7e0e336a-3dc8-45fe-a6c2-280cab92f349' \
  -H 'version: 2' \
  -H 'ocp-apim-subscription-key: b7bda12ac31a47b0a2bb3cad633d60c4' \
  -H 'Content-Type: application/json' \
  --data-raw '{"dateOfBirth":"04/20/1980","sendOnlyEligible":true}' \
  --compressed

Not sure if the correlationId header is some kind of session token or a more static API key sort of thing, or what, but it is required.

Mr0grog commented 1 year ago

First vaccine serviceTypeName:

Second dose does not include previous vaccination date in the payload. serviceTypeNames:

If I set a young age:

Mr0grog commented 1 year ago

It appears that if you specify multiple serviceTypeNames, you get locations that support all of those types, not locations that support any. So we need to make multiple queries per zip code to get a full accounting.
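
Roughly, that means one request per service type, merged by store number. Here is a Python sketch of the merge. `fetch` is a hypothetical callable that takes `(zip_code, service_type_name)` and returns the parsed JSON list for that one query; the actual HTTP call is left out.

```python
def collect_locations(zip_code, service_type_names, fetch):
    """Query each service type separately and merge the results per store."""
    merged = {}
    for name in service_type_names:
        for entry in fetch(zip_code, name):
            number = entry["store"]["storeNumber"]
            record = merged.setdefault(
                number, {"store": entry["store"], "service_types": set()}
            )
            # Record whatever service types the API lists for this store
            # in this response (these may differ from the name we queried).
            record["service_types"].update(
                service["serviceTypeName"]
                for service in (entry.get("serviceTypes") or [])
            )
    return merged
```

The cost is one request per service type per zip code, which is a big part of why this is so much heavier than the MHealth source.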

Mr0grog commented 1 year ago

I thought, cleverly, that if I used COVID-19 Unknown as the service type name, I might get a superset, but it returns location records where the available service types are just COVID-19 Unknown.

I’m not even clear whether I’m getting back locations that do any COVID vaccine, or something else entirely. If the former, we could do that to get a list of locations where some kind of COVID vaccine is available, and skip product info, but that’s not great. :(

It seems like you also have to provide a single specific service type in order to list slots, although slots get listed location by location, and probably wouldn't be worth requesting in the first place (too much overhead).

Mr0grog commented 1 year ago

Finally, the availableAppointments field always seems to be null, BUT it appears that a location is only returned if there are available appointments. So we can use listings as a YES signal, then find all other Albertsons locations (since we have a complete list) and mark them as NO.
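
A sketch of that YES/NO logic in Python. The function name is hypothetical. It assumes the searches actually covered every known store; any store that fell outside every search radius would be wrongly marked NO.

```python
def mark_availability(known_store_numbers, returned_store_numbers):
    """Mark known stores YES if they showed up in any search, NO otherwise.

    Stores that were returned but are not in the known list are ignored.
    """
    returned = set(returned_store_numbers)
    return {
        number: ("YES" if number in returned else "NO")
        for number in known_store_numbers
    }
```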

Mr0grog commented 1 year ago

Search radius can go at least to 76 miles, but it does get capped at some level. It's unclear what number that is, so maybe best to stick with 50, which is the max in the UI.