openreferral / specification

The Human Services Data Specification - a data exchange format developed by the Open Referral Initiative
https://openreferral.org

Incorporate a flat data model #128

Open pmackay opened 7 years ago

pmackay commented 7 years ago

Has there been consideration of including a flat, single-sheet data model in the core OR spec, similar to what was described in Introducing the Humanitarian Service Data Model?

timgdavies commented 7 years ago

There has been some discussion, but nothing concrete as far as I know.

Can you describe a use-case that would want this?

At the moment, the model is quite strongly relational, so the most obvious flat versions would either:

(a) lead to a lot of duplicated rows;

and/or

(b) only really be useful for very simple data (services with one location, one address, etc.)
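To make option (a) concrete, here is a minimal sketch of what happens when a relational record is joined into one sheet. The table and field names are hypothetical and much simpler than the real OR tables; the point is only that one service with two locations and two phone numbers becomes four rows.

```python
# Hypothetical miniature of a relational dataset: one service, two locations,
# two phone numbers. Names are illustrative, not the actual OR schema.
services = [{"id": "s1", "name": "Food Bank"}]
locations = [
    {"service_id": "s1", "address": "1 Main St"},
    {"service_id": "s1", "address": "9 Oak Ave"},
]
phones = [
    {"service_id": "s1", "number": "555-0100"},
    {"service_id": "s1", "number": "555-0199"},
]


def flatten_to_rows(services, locations, phones):
    """Emit one flat row per (service, location, phone) combination."""
    rows = []
    for service in services:
        # Fall back to a single empty record so a service with no
        # locations or phones still produces a row.
        svc_locations = [l for l in locations if l["service_id"] == service["id"]] or [{}]
        svc_phones = [p for p in phones if p["service_id"] == service["id"]] or [{}]
        for loc in svc_locations:
            for phone in svc_phones:
                rows.append({
                    "service_name": service["name"],
                    "address": loc.get("address", ""),
                    "phone": phone.get("number", ""),
                })
    return rows


rows = flatten_to_rows(services, locations, phones)
for row in rows:
    print(row)
```

Every added one-to-many relationship multiplies the row count again, which is why the single-sheet form works best for the simple data described in (b).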

pmackay commented 7 years ago

I agree that (a) and (b) are both entirely possible. However, there are times when the simplicity of adding and editing data on one sheet is a worthwhile tradeoff against duplication, particularly if the data is quite simple.

The author of that post needed it. While doing various data management tasks on projects, I'm finding there are times when a single sheet is simpler than multiple sheets. So I'm wondering whether it would be useful to publish some docs on how best to structure single-sheet variants, so that their usage and adoption is consistent for OR. I'd be happy to help with that if others feel it's a handy addition.

klambacher commented 7 years ago

For consumers of our data using external systems, the number one request continues to be a single, flat tabular format of the data ("Give me a simple CSV!").

As a consumer of the information, I am less thrilled to have it in this format, because I don't like the large amount of extra work required to parse information out into our highly-relational system, and feel that this can often lead to losing data (or at least losing data resolution), unnecessary duplication, and loss of relationships. However, I do believe that having an official or quasi-official format for a single, flat table would be extremely beneficial. The demand for this single, flat format will not go away, and will continue to be requested - if there's a predictable format for it, some of these data aggregators can more easily consume data from multiple sources and we can at least agree on some of the features needed to make it easier to consume.

devinbalkind commented 7 years ago

It seems to me that it would be possible to create an inelegant CSV file with all the fields, plus a "guidance" document that explains, step by step, how that CSV could be converted into an OR-compliant data package.
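As a rough sketch of what one step in such guidance might look like, the snippet below splits a flat sheet back into separate organization, service and location tables, de-duplicating repeated values along the way. The column names, the generated ids and the three-table layout are all assumptions made for illustration; this is a simplification and does not attempt to reproduce the actual OR tables or identifiers.

```python
import csv
import io

# Hypothetical flat sheet; the column names are not an official template.
FLAT_CSV = """organization_name,service_name,location_address
Helping Hands,Food Bank,1 Main St
Helping Hands,Food Bank,9 Oak Ave
Helping Hands,Legal Advice,1 Main St
"""


def get_or_create(table, key, prefix, fields):
    """Return the id stored under key, creating a record with a new sequential id if needed."""
    if key not in table:
        table[key] = {"id": f"{prefix}-{len(table) + 1}", **fields}
    return table[key]["id"]


def split_flat_rows(flat_rows):
    """Split flat rows into organization, service and location tables."""
    organizations, services, locations = {}, {}, {}
    for row in flat_rows:
        org_id = get_or_create(
            organizations, row["organization_name"], "org",
            {"name": row["organization_name"]},
        )
        service_id = get_or_create(
            services, (org_id, row["service_name"]), "svc",
            {"organization_id": org_id, "name": row["service_name"]},
        )
        get_or_create(
            locations, (service_id, row["location_address"]), "loc",
            {"service_id": service_id, "address": row["location_address"]},
        )
    return list(organizations.values()), list(services.values()), list(locations.values())


organizations, services, locations = split_flat_rows(csv.DictReader(io.StringIO(FLAT_CSV)))
print(len(organizations), len(services), len(locations))
```

Note that the sketch can only match records on their text values, so a typo in one row of the flat sheet would create a second organization or service. Any real guidance would need to say how to handle that.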

I'm also interested in a simplified OR data model that could be implemented using AirTable.com's fantastic spreadsheet-like application. It could be three CSV tables (organization, location, service) or four (adding contacts). This simplified model and AirTable app would make it significantly easier for humanitarian organizations, grassroots groups and people who aren't IR professionals to manage, display and share this data.

timgdavies commented 7 years ago

Thanks for the input all. It seems to me that some thought about a reasonably standardised CSV serialisation is definitely something that should be part of the next iteration of the standard.

This is something we've looked at quite a lot with 360 Giving, which is defined by a canonical JSON structure that can be flattened according to some rules, and which also allows 'designed' spreadsheet templates.

Applying this model to OR might hit some issues with hierarchy, as OR is not hierarchical right now (i.e. it is not clear what the top-level object in OR is) - but it would be a basis we could work from.
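For a sense of what rule-based flattening could look like, here is a minimal sketch that turns a nested JSON-style record into a single row keyed by slash-separated paths. It assumes, purely for illustration, that a service is the top-level object, and the field names and path convention are hypothetical; it is not the actual 360 Giving flattening rules.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by slash-separated paths."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}/{key}" if prefix else key))
    elif isinstance(value, list):
        # List items are addressed by their position, e.g. locations/0/address.
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}/{index}" if prefix else str(index)))
    else:
        flat[prefix] = value
    return flat


# Hypothetical record with a service as the top-level object.
service = {
    "name": "Food Bank",
    "organization": {"name": "Helping Hands"},
    "locations": [
        {"address": "1 Main St"},
        {"address": "9 Oak Ave"},
    ],
}

row = flatten(service)
for column, cell in row.items():
    print(column, "=", cell)
```

Each key in `row` would become a column heading, so the choice of top-level object directly determines what a row in the flat sheet represents.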

@pmackay If you were up for drafting some simple templates, or guidance on how people can create them with/from OR data, that would be fantastic to help move this forward...

timgdavies commented 7 years ago

This wasn't completed as part of 1.1. But it remains an important issue.

It might also follow well from the JSON work for the API specification - as we could potentially then use the 360 Giving Flattening approach.