CAVaccineInventory / vial

The Django application powering calltheshots.us
https://vial.calltheshots.us
MIT License
13 stars 1 forks source link

Use "inventory" key from source location instead of using "source->data->inventory" #578

Open simonw opened 3 years ago

simonw commented 3 years ago

The nasty hack that caused error #577 among other things works by looking in source->data->inventory in the source_location table.

Our ingestion team have actually already normalized this data for us, into the root level inventory key!

Some examples: https://vial.calltheshots.us/dashboard/?sql=select+*+from+source_location+where+import_json->'inventory'+is+not+null+limit+10%3Afiyx7WqNr1QZ4IqoD4duMZ-JERkXTBrHYVU7HWKi9_M

  "inventory": [
    {
      "vaccine": "moderna",
      "supply_level": "out_of_stock"
    },
    {
      "vaccine": "pfizer_biontech",
      "supply_level": "out_of_stock"
    },
    {
      "vaccine": "johnson_johnson_janssen",
      "supply_level": "out_of_stock"
    }
  ],
  "availability": {
    "drop_in": false
  },
  "opening_hours": []
}

We should switch our code to read this instead, which will also help simplify our nasty hack.

simonw commented 3 years ago

This is the code we can improve with this: https://github.com/CAVaccineInventory/vial/blob/da087de81719f6e0dd00d6e8979b5b36e5e9c2da/vaccinate/core/expansions.py#L18-L67

alexmv commented 3 years ago

This is also currently the longest database query we're doing right now: https://ui.honeycomb.io/vaccinateca/datasets/vial-production/result/wWzfoc2q8Zk

We can make this indexed, even via:

EXPLAIN
SELECT
    location.public_id,
    source_location.import_json @> '{"inventory": [{"vaccine": "moderna", "supply_level": "in_stock"}]}' AS "moderna",
    source_location.import_json @> '{"inventory": [{"vaccine": "johnson_johnson_janssen", "supply_level": "in_stock"}]}' AS "jj",
    source_location.import_json @> '{"inventory": [{"vaccine": "pfizer_biontech", "supply_level": "in_stock"}]}' AS "pfizer"
FROM
    source_location
    JOIN LOCATION ON location.id = source_location.matched_location_id
WHERE
    source_location.import_json @> '{"inventory": [{"vaccine": "moderna", "supply_level": "in_stock"}]}'
    OR source_location.import_json @> '{"inventory": [{"vaccine": "johnson_johnson_janssen", "supply_level": "in_stock"}]}'
    OR source_location.import_json @> '{"inventory": [{"vaccine": "pfizer_biontech", "supply_level": "in_stock"}]}'

...and making a jsonb index on it; something like:

create index concurrently some_good_name on source_location using gin(import_json jsonb_path_ops);
simonw commented 3 years ago

Really like that idea!