openreferral / specification

The Human Services Data Specification - a data exchange format developed by the Open Referral Initiative
https://openreferral.org
Other
117 stars 49 forks source link

Documenting behaviour for tables with multiple foreign keys #121

Closed timgdavies closed 7 years ago

timgdavies commented 7 years ago

A number of HSDS tables have multiple foreign keys.

For example, Holiday Schedule can be linked to service_id or location_id.

We need to document clearly whether consuming applications should ever expect more than one foreign key to be filled in, and how they should handle this if so.

My understanding would be that if you get more than one foreign key, the row in the table applies to the conjunction (AND) of the given foreign keys only.

For example, a holiday schedule that row records opening times for Monday with service_id of '1' and location_id of '25' applies to:

But it does not apply to:

nor to:

Does this fit with how others would interpret this?

timgdavies commented 7 years ago

I've been looking at this more - and suggest we move it to 1.1, as it might call for some wider field changes to make things entirely consistent across the spec.

E.g. #124 highlights that we have a couple of different ways in use right now of specifying location-specific contact information.

NeilMcKLogic commented 7 years ago

Tim, I agree with your initial "conjunction" interpretation but that is pretty subtle and might be missed by more junior implementers. Agreed we should explore better clarity in 1.1.

timgdavies commented 7 years ago

We discussed this on today's Open Referral Assembly Call, where I outlined some of the different sub-tables that currently have foreign keys against more than one major entity (organisation, service, location):

image

For example, the phone table can have a location_id, service_id, organization_id and contact_id. There was a further proposal that it should also have a service_at_location_id.

I thought it might be useful to construct some examples to test any approach we take.

Data supply side

Let's imagine an organization (o1) that provides a housing advice service (s1).

The service operates at three locations (l1, l2, l3).

There are a number of possible phone numbers here:

The first three of these fairly simply live in the phone table as so (with description here rather than dept for clarity and in-line with the proposal in #126):

id organization_id service_id location_id number description
a o1 111 111 111 Head Office
b s1 222 222 222 Service manager
c l1 333 333 333 Location 1 reception

However, when we come to the number for the team at each location, we either (a) need the link to service_at_location; or we use multiple foreign keys as:

id organization_id service_id location_id number description
d s1 l1 444 444 444 Location 1 : Housing Advice Service
e s1 l2 555 555 555 Location 2 : Housing Advice Service
f s1 l3 666 666 666 Location 3 : Housing Advice Service

or

id service_at_location_id ... number description
d sal1 444 444 444 Location 1 : Housing Advice Service
e sal2 555 555 555 Location 2 : Housing Advice Service
f sal3 666 666 666 Location 3 : Housing Advice Service

Data consumption

User story

Let's then imagine a user, Bob.

Bob wants to find the phone number of his nearest Housing Advice Service.

He tries calling it, but it goes to answerphone, and announces that the service is not operating this week.

He thinks he should check when they will be open again, so decides to phone the service manager, and looks for that number.

Technical implications

Multiple foreign keys

In the first case (multiple foreign keys), the query for Bob's first case would be along the lines of:

SELECT * from phones where service_id = 's1' AND location_id = 'l1'

But in his second case, we can't just use:

SELECT * from phones where service_id = 's1'

as we are going to get all the service_at_location entries. So we would need to use:

SELECT * from phones where service_id = 's1' AND location_id = ''

And in fact, as locations could have organization_id and contact_id we might also need to do something with those.

Further - I've not considered here the fall-back cases (e.g. you might want to give the phone numbers with a service_id when nothing with service_id and location_id is present)

Service_at_location linkage

ToDo: there would be a different query strategy here

Reflections

Unlike some of the other standards I've worked on - where different applications can reasonably make quite different decisions about what to display to users - it feels pretty important that a data publisher can know how a consumer will render their information in Open Referral - as otherwise service users could be quite misled. Although I might be overthinking...

timgdavies commented 7 years ago

Ok. Looping back to this and following discussions with @ekoner a few weeks back:

I'm testing out adding a foreign key for service_at_location to phone, contact and schedule.

This would allow an 'or' rather than 'and' approach to multiple foreign keys, whilst retaining the ability to 'over-ride' the phone number, contact person or schedule for a service at a specific location.

This would mean that:

In this model, if a service varied in other ways at particular locations (e.g. different eligibility etc, then that would need a duplicate service entry to be created with those different details).

Worked up - this does feel easier to explain than the multiple foreign key logic above...

ekoner commented 7 years ago

@timgdavies that makes sense and aligns with our discussion on balancing flexibility for the publisher with clarity for the user. It would be prudent to test this out with some scenarios and real data.