smart-on-fhir / smart-scheduling-links

Clinical Appointment Slot Discovery

Using `contained` for inlining schedules #12

Closed Mr0grog closed 3 years ago

Mr0grog commented 3 years ago

For cases where a publisher is only providing schedules for a single service type, is it feasible or reasonable to cut down on the number of requests by inlining the schedule resources in the locations using `contained`? (It looks like the schedule type forbids inlining slots, but locations do not forbid inlining schedules.)

(In my case, I’m working with state governments considering how they should publish data that combines information from various providers. They are currently only concerned with publishing COVID-19-related data, so the number of schedules per location is pretty constrained: maybe just one for COVID-19 shots, or maybe separate ones for first vs. second dose or for different vaccine types. In this scenario, it might be simpler if schedules and locations could be combined.)

For example, I might have the manifest:

{
  "transactionTime": "2021-01-01T00:00:00Z",
  "request": "https://example.com/covid-vaccines/$bulk-publish",
  "output": [
    {
      "type": "Location",
      "url": "https://example.com/data/location_file_1.ndjson"
    },
    {
      "type": "Slot",
      "url": "https://example.com/data/slot_file_MA.ndjson"
    }
  ],
  "error": []
}

And the location file:

{
  "resourceType": "Location",
  "id": "123",
  "name": "Flynn's Pharmacy in Pittsfield, MA",
  "description": "Located behind old Berkshire Bank building",
  "telecom": [{
    "system": "phone",
    "value": "413-000-0000"
  }, {
    "system": "url",
    "value": "https://pharmacy.example.com"
  }],
  "address": {
    "line": ["173 Elm St"],
    "city": "Pittsfield",
    "state": "MA",
    "postalCode": "01201-7223"
  },
  "contained": [
    {
      "resourceType": "Schedule",
      "id": "456",
      "serviceType": [
        {
          "coding": [
            {
              "system": "http://terminology.hl7.org/CodeSystem/service-type",
              "code": "57",
              "display": "Immunization"
            },
            {
              "system": "http://fhir-registry.smarthealthit.org/CodeSystem/service-type",
              "code": "covid19-immunization",
              "display": "COVID-19 Immunization Appointment"
            }
          ]
        }
      ]
    }
  ]
}

And finally the slots file:

{
  "resourceType": "Slot",
  "id": "789",
  "schedule": {
    "reference": "Schedule/456"
  },
  "status": "free",
  "start": "2021-03-10T15:00:00-05:00",
  "end": "2021-03-10T15:20:00-05:00",
  "extension": [{
    "url": "http://fhir-registry.smarthealthit.org/StructureDefinition/booking-deep-link",
    "valueUrl": "https://ehr-portal.example.org/bookings?slot=opaque-slot-handle-89172489"
  }]
}

I think this is correct and valid by FHIR, but I’m relatively new to the standard. The SMART Scheduling Links standard here uses separate resource files in all the examples, but the text doesn’t seem to forbid the above. Not sure whether this is something that:

In general, it feels like there’s a lot of gray area like this between the simple path suggested by https://github.com/smart-on-fhir/smart-scheduling-links/blob/master/specification.md and the full implications of the objects in the spec being specific resource types that are defined more fully in the complete FHIR spec.

jmandel commented 3 years ago

> I think this is correct and valid by FHIR, but I’m relatively new to the standard.

Thanks @Mr0grog! First off, let me say it's great that you're digging into the standard, and I know the learning curve can be steep. There are a couple of issues worth considering here:

  1. Using contained resources for data that can be stably identified and given a "proper" id violates FHIR's methodology; see the bold text in https://www.hl7.org/fhir/references.html#contained for background (basically: "contained" should only be used in cases where you don't know enough about a resource to manage it stably over time on its own; it should never be used for convenience of packaging -- for that, we have things like Bundle).
  2. Even if we wanted to ignore the advice in (1), the technical details of the examples above aren't valid -- you can't point to Schedule/456 from a Slot if there is no top-level Schedule/456 resource (and in your example, there isn't; this is sitting contained in a Location). In theory you might be able to write something like schedule: {reference: "Location/123#456"}, but I don't think this kind of cross-resource-into-contained reference is technically valid (just asked here to clarify) -- and if it is, it's nevertheless an extremely unusual pattern that we'd do well to avoid for this reason.
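
To illustrate point 2, here is a minimal sketch of how a consumer might detect Slots whose `schedule` reference does not resolve to a top-level Schedule. The function name `unresolved_schedule_refs` and the in-memory lists of NDJSON lines are hypothetical; the sketch only handles relative references of the form `Schedule/<id>`, which is an assumption about how publishers write them.

```python
import json

def unresolved_schedule_refs(schedule_lines, slot_lines):
    """Return (slot id, reference) pairs whose Schedule is not a top-level resource."""
    # Collect the ids of every top-level Schedule in the schedule file.
    top_level = set()
    for line in schedule_lines:
        if line.strip():
            resource = json.loads(line)
            if resource.get("resourceType") == "Schedule":
                top_level.add(f"Schedule/{resource['id']}")

    # A Slot is a problem if its reference is absent from that set.
    missing = []
    for line in slot_lines:
        if not line.strip():
            continue
        slot = json.loads(line)
        ref = slot.get("schedule", {}).get("reference")
        if ref not in top_level:
            missing.append((slot.get("id"), ref))
    return missing

# The Slot from the example in this thread.
slots = ['{"resourceType": "Slot", "id": "789", "schedule": {"reference": "Schedule/456"}}']

# With the Schedule only contained in a Location, no top-level Schedule exists.
print(unresolved_schedule_refs([], slots))

# Publishing the Schedule as its own resource resolves the reference.
schedules = ['{"resourceType": "Schedule", "id": "456"}']
print(unresolved_schedule_refs(schedules, slots))
```

In the first call the Slot is reported as unresolved; in the second call nothing is reported.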

Zooming out (and looking ahead) a bit, it seems likely that we'll be defining additional Schedule types as we go (e.g., to distinguish between calendars for different vaccine products, and 1st dose vs 2nd dose appointments, etc). (Edit to add: just saw you mentioned exactly this point in #13!) Still, Schedules and Locations will be fairly static. What are we trying to optimize for specifically? The total number of requests shouldn't be a big deal if the vast majority of requests for Location and Schedule files return immediately with 304 Not Modified responses.
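
As a rough sketch of the client side of that 304 Not Modified flow, the following uses Python's standard `urllib`. The helper names `conditional_headers` and `fetch_if_changed` are hypothetical, and the sketch assumes the publisher's server returns `ETag` and/or `Last-Modified` headers, which this thread does not guarantee.

```python
import urllib.error
import urllib.request

def conditional_headers(etag=None, last_modified=None):
    """Build conditional request headers from validators saved on a previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers

def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        # urllib reports 304 Not Modified as an HTTPError; treat it as "unchanged".
        if error.code == 304:
            return None, etag, last_modified
        raise
```

A client would store the returned validators next to each Location or Schedule file and pass them back on the next poll, so unchanged files cost only a header exchange.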

Mr0grog commented 3 years ago

More than trying to optimize anything, I’m trying to map out the edge cases I might encounter here, since in my work with states, I am consuming the output of a wide variety of systems we hope will conform to this standard just as much or more than I am publishing (I’m sort of working my way through potential cases by imagining the crazy things I might do as a publisher). FHIR is a reasonably complex standard, and I want to make sure we are at least doing a halfway decent job of anticipating and handling the various input we will potentially encounter across clinics and pharmacies.

In states’ capacity as publishers (if we republished combined data from various providers using this standard), we might be interested in trying to reduce:

That said, I don’t necessarily think either of the above are a big deal. My primary concern, again, was understanding what weird things might be conformant and that I might have to handle. Sounds like this isn’t something that would conform to FHIR in the first place, though. 😄

jmandel commented 3 years ago

> (I’m sort of working my way through potential cases by imagining the crazy things I might do as a publisher)

Awesome, that's a super valuable exercise!

Re: overall load, it's a fair point about whether clients will be "nice enough" to send If-None-Match (with a stored ETag) or If-Modified-Since headers; but in the general case, publishers also need to deal with denial of service attacks, so it's important to have approaches to rate limiting with any pull-based spec. I'm not sure hosting one locations file + one schedules file is very different from a single file with locations and schedules combined.

Re: complexity of parsing/loading the data, I agree this is important, but "what's simpler" is a bit of a matter of perspective here. Parsing a uniform file full of Schedules and a uniform file full of Locations makes populating a database from these files very straightforward. For example, with type-specific files you can pipe lines directly into sqlite tables, and get a table full of Locations and a table full of Schedules and write SQL queries that join across them.
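
A minimal sketch of that loading pattern, using Python's built-in `sqlite3` module rather than a shell pipe. The table layout is hypothetical, and the sketch assumes each Schedule points at its Location through an `actor` reference, which the Schedule in the example earlier in this thread omits.

```python
import json
import sqlite3

# Stand-ins for the lines of a Location NDJSON file and a Schedule NDJSON file.
locations = [{"resourceType": "Location", "id": "123",
              "name": "Flynn's Pharmacy in Pittsfield, MA"}]
schedules = [{"resourceType": "Schedule", "id": "456",
              "actor": [{"reference": "Location/123"}]}]  # assumed link to the Location
location_lines = [json.dumps(resource) for resource in locations]
schedule_lines = [json.dumps(resource) for resource in schedules]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE locations (id TEXT PRIMARY KEY, name TEXT, resource TEXT)")
db.execute("CREATE TABLE schedules (id TEXT PRIMARY KEY, location_id TEXT, resource TEXT)")

# Each file holds a single resource type, so each line maps to one row in one table.
for line in location_lines:
    resource = json.loads(line)
    db.execute("INSERT INTO locations VALUES (?, ?, ?)",
               (resource["id"], resource.get("name"), line))

for line in schedule_lines:
    resource = json.loads(line)
    # "Location/123" -> "123"
    location_id = resource["actor"][0]["reference"].split("/", 1)[1]
    db.execute("INSERT INTO schedules VALUES (?, ?, ?)",
               (resource["id"], location_id, line))

# Join schedules to the locations they belong to.
rows = db.execute(
    "SELECT s.id, l.name FROM schedules s JOIN locations l ON l.id = s.location_id"
).fetchall()
print(rows)
```

Keeping the raw JSON in a `resource` column is one design choice among several; it lets later queries reach fields that were not broken out into columns.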

> That said, I don’t necessarily think either of the above are a big deal.

Cool -- I agree these things are important to think about, and I'm very pleased you're digging in. I'm going to close this issue as "nothing to change" for now (but feel free to carry on discussion about these points here, and don't take this as a sign that I'm anything less than enthusiastic about input and ideas here!)