openreferral / specification

The Human Services Data Specification - a data exchange format developed by the Open Referral Initiative
https://openreferral.org

Example datasets? #82

Closed pmackay closed 7 months ago

pmackay commented 9 years ago

Are there any example datasets (that match the 1.0 spec!) available on the web? Would be great to have those linked from http://openreferral.org/.

bengolder commented 9 years ago

Just wanted to add that I came here looking for an example and can't find one. I would also like to see an example somewhere prominent in the documentation.

fureigh commented 9 years ago

+1 for examples being useful.

niveditc commented 9 years ago

+1 for examples as it would be great for testing the validation tool that I'm creating (http://niveditc.github.io/open-referral-validation/).

greggish commented 9 years ago

I'm sorry it's taken so long to address this. Previous versions of the spec were posted with sample datapackages, and I'd assumed that would have been the case with v1.0, but somehow it didn't happen. I'm asking around to see which of the pilots has a good sample that they can share. Thanks for your patience.

bengolder commented 9 years ago

Just reflecting that there seem to be two levels of need here:

  1. Demonstrative, detailed data sets as part of the user documentation (aimed at existing users, for testing, or for those looking deeply into the spec).
  2. Small, simplified examples that show the basics of OpenReferral at a glance, aimed at curious passersby (such as myself).

monfresh commented 9 years ago

Example CSV files are available in the Ohana API repo here: https://github.com/codeforamerica/ohana-api/tree/master/data/sample-csv

The Ohana API Wiki also has instructions for creating the CSV files, including all the columns in each file, whether or not it is required, and a description of the field, including any special formatting. I think the documentation is easier to read when presented in this fashion, and I would recommend that the documentation in this repo be updated to match.

niveditc commented 9 years ago

@monfresh, thank you! This will be very helpful.

greggish commented 9 years ago

Thanks @monfresh. I know there has been work done on upgraded documentation for the spec, so I'll share your suggestion.

Just to clarify: alongside those CSVs, I believe we still don't have an example JSON datapackage to go with them. We have this datapackage.json template here, but until we have a sample datapackage that corresponds with the sample CSVs, this issue should remain open.

monfresh commented 9 years ago

Note that the sole purpose of the datapackage.json is to provide documentation about the CSV files, but since the spec is already documented, I personally consider the datapackage.json to be redundant. The presence of datapackage.json is in no way necessary in order to import the data into any system.

The GTFS, for example, has excellent online documentation, and does not make use of a datapackage.json.

If the set of CSV files were to be shared with someone who was not familiar with the spec, I think including a text file with a link to the spec documentation would suffice.

greggish commented 9 years ago

At first glance I was also wary of the addition of datapackage.json to our spec, as it appears to add another layer of technical complexity that may fall outside the grasp of many non-technical users who might produce or import HSDS data.

However, as I understand it, this is an emerging protocol for publishing structured datasets on the web, one that could remove some of the friction posed by the complexity of our spec. GTFS was designed before the creation of the datapackage protocol, and I've heard some suggest in retrospect that GTFS may have benefited from it. Furthermore, our spec is at this point significantly more elaborate than GTFS.

So, we should welcome opportunities to test these assumptions.

More about datapackages here: http://dataprotocols.org/data-packages/

And a tool for creating them here: http://ckan.org/2014/06/09/the-open-knowledge-data-packager/

cc @monfresh
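
For readers unfamiliar with the format under discussion, here is a minimal sketch of what a datapackage.json descriptor for a set of CSV files could look like, generated from the CSV header rows. It assumes the general Data Package layout (a `name` plus a list of `resources`, each with a `path` and a `schema` of `fields`). The file name `organizations.csv`, the package name `hsds-sample`, the helper `build_descriptor`, and the sample row are all hypothetical and are not taken from the HSDS spec or from any real dataset.

```python
import csv
import io
import json


def build_descriptor(package_name, csv_texts):
    """Build a minimal Data Package style descriptor from CSV header rows.

    csv_texts maps a file path (hypothetical here) to that file's text.
    Every column is typed as "string"; a real descriptor would refine this.
    """
    resources = []
    for path, text in csv_texts.items():
        # Only the first row (the header) is needed to list the fields.
        header = next(csv.reader(io.StringIO(text)))
        resources.append({
            "name": path.rsplit(".", 1)[0],
            "path": path,
            "schema": {
                "fields": [{"name": col, "type": "string"} for col in header]
            },
        })
    return {"name": package_name, "resources": resources}


# Hypothetical sample content, not taken from any real HSDS dataset.
sample = {
    "organizations.csv": "id,name,description\n1,Example Org,A placeholder record\n"
}
descriptor = build_descriptor("hsds-sample", sample)
print(json.dumps(descriptor, indent=2))
```

The descriptor only describes the CSV files; the CSVs themselves can still be read directly without it, which is the trade-off debated in the comments above.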

timgdavies commented 7 years ago

We have been working on an example dataset to put out with the further release of the tidied-up 1.0 docs.

We are now working with datapackage.json as our definition for the HSDS, which means we can generate nice views of packages like the example here and can get the benefits of data package tooling in future.
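
As a rough illustration of the kind of tooling a datapackage.json definition makes possible, the sketch below checks a CSV header row against the fields listed in a descriptor. It is not the project's actual tooling: the helper `missing_columns`, the descriptor fragment, and the field names are hypothetical and chosen only for illustration.

```python
import csv
import io


def missing_columns(descriptor, resource_name, csv_text):
    """Return schema field names that are absent from the CSV header row."""
    resource = next(
        r for r in descriptor["resources"] if r["name"] == resource_name
    )
    expected = [f["name"] for f in resource["schema"]["fields"]]
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in expected if name not in header]


# Hypothetical descriptor fragment; the field names are illustrative only.
example_descriptor = {
    "name": "hsds-sample",
    "resources": [{
        "name": "organizations",
        "path": "organizations.csv",
        "schema": {"fields": [
            {"name": "id", "type": "string"},
            {"name": "name", "type": "string"},
            {"name": "description", "type": "string"},
        ]},
    }],
}

# A CSV that lacks the "description" column declared in the descriptor.
problems = missing_columns(
    example_descriptor, "organizations", "id,name\n1,Example Org\n"
)
print(problems)
```

An empty list would mean every field declared in the descriptor has a matching column in the file.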

timgdavies commented 7 years ago

Moving to 1.1

NeilMcKLogic commented 7 years ago

I'd be willing to provide an example in the future but first want to incorporate all the 1.0 updates Tim recently published to ensure it is compliant. I have provided a dataset privately to @timgdavies and am expecting the good sir will have some such feedback for me soon.

mrshll1001 commented 7 months ago

I'm closing this since as of 3.0 we have a dedicated examples/ folder on the repo, and examples are rendered for each object on the schema reference page in the docs. This isn't quite a complete example dataset, but I believe this addresses similar concerns and is more appropriate and manageable for the current docs.