FamilySearch / gedcomx

An open data model and an open serialization format for exchanging genealogical data.
http://www.gedcomx.org
Apache License 2.0

specify how record collections reference the records they contain #67

Closed: stoicflame closed this issue 13 years ago

stoicflame commented 13 years ago

We need to define how record collections specify their contents.

The simplest solution would be to just have the collection contain a list of links to its records:

<collection>
  <title>...</title>
  ...
  <record rdf:resource="..."/>
  <record rdf:resource="..."/>
  <record rdf:resource="..."/>
  ...
</collection>

But is that adequate? What if there are millions of records in the collection? Does the record list need to be split up? If the answer is "no, we'll just deal with big collections", then we can just go with this suggestion. It's definitely the clearest option.

Another option, mentioned here just for the sake of discussion, is to do it the way RDF does with its collection construct (rdf:List, built from rdf:first and rdf:rest), which is structured as a linked list. It would look something like:

<collection>
  <title>...</title>
  ...
  <records rdf:nodeID="myrecordlist"/>
  ...
</collection>

<rdf:Description rdf:nodeID="myrecordlist">
  <rdf:first rdf:resource="..."/>
  <rdf:rest rdf:nodeID="myrecordlist2"/>
</rdf:Description>

<rdf:Description rdf:nodeID="myrecordlist2">
  <rdf:first rdf:resource="..."/>
  <rdf:rest rdf:nodeID="myrecordlist3"/>
</rdf:Description>

<rdf:Description rdf:nodeID="myrecordlist3">
  <rdf:first rdf:resource="..."/>
  <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</rdf:Description>
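
To make the linked-list option concrete, here is a rough sketch in Python of how a consumer might walk such a list and recover the record links in order. It is only an illustration under assumptions: the sample document, the example.org record URIs, and the function name walk_record_list are all made up for this sketch and are not part of any GEDCOM X proposal.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_NIL = RDF_NS + "nil"

# Hypothetical sample data: the record URIs are placeholders.
SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}">
  <rdf:Description rdf:nodeID="myrecordlist">
    <rdf:first rdf:resource="http://example.org/records/1"/>
    <rdf:rest rdf:nodeID="myrecordlist2"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="myrecordlist2">
    <rdf:first rdf:resource="http://example.org/records/2"/>
    <rdf:rest rdf:nodeID="myrecordlist3"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="myrecordlist3">
    <rdf:first rdf:resource="http://example.org/records/3"/>
    <rdf:rest rdf:resource="{RDF_NIL}"/>
  </rdf:Description>
</rdf:RDF>
"""


def q(name):
    """Return the ElementTree-qualified name for a term in the RDF namespace."""
    return f"{{{RDF_NS}}}{name}"


def walk_record_list(xml_text, head_id):
    """Follow rdf:first / rdf:rest from head_id and return record URIs in order."""
    root = ET.fromstring(xml_text)
    # Index every list node by its rdf:nodeID so each hop is a dictionary lookup.
    nodes = {d.get(q("nodeID")): d for d in root.iter(q("Description"))}

    records = []
    seen = set()
    node_id = head_id
    while node_id is not None:
        if node_id in seen:
            raise ValueError(f"cycle detected at node {node_id!r}")
        seen.add(node_id)

        node = nodes[node_id]
        records.append(node.find(q("first")).get(q("resource")))

        rest = node.find(q("rest"))
        if rest.get(q("resource")) == RDF_NIL:
            node_id = None  # reached the end of the list
        else:
            node_id = rest.get(q("nodeID"))
    return records
```

Note that reading the whole list still takes one hop per record, so this structure by itself does not make a collection with millions of records any cheaper to read than the flat list.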

Thoughts and comments are welcome.

carpentermp commented 13 years ago

I don't really understand the 2nd option very well so all I can say about it is that it seems more verbose, so option 1 appeals more to me.

I suppose that for data exchange having the full list is fine, but not for a web service. A web service would need a way to page through the contents. Also, remember that record order is significant and that, in a web service, we need a way of efficiently moving from a given Record to the previous and next Record.
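
As a rough illustration of the paging and previous/next requirement, here is a small Python sketch. It assumes the service holds the collection's record URIs as one ordered list; the names Page, get_page, and neighbors are hypothetical and not taken from any existing API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    records: List[str]          # record URIs on this page, in collection order
    prev_start: Optional[int]   # offset of the previous page, or None on the first page
    next_start: Optional[int]   # offset of the next page, or None on the last page


def get_page(record_uris: List[str], start: int, size: int) -> Page:
    """Return one page of an ordered record list plus offsets for paging."""
    if size <= 0:
        raise ValueError("size must be positive")
    if start < 0 or start > len(record_uris):
        raise ValueError("start is out of range")
    end = start + size
    return Page(
        records=record_uris[start:end],
        prev_start=max(start - size, 0) if start > 0 else None,
        next_start=end if end < len(record_uris) else None,
    )


def neighbors(record_uris: List[str], uri: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the (previous, next) record URIs around uri, preserving order.

    list.index is a linear scan; a real service would presumably keep an
    index from URI to position so this lookup stays cheap.
    """
    i = record_uris.index(uri)
    prev_uri = record_uris[i - 1] if i > 0 else None
    next_uri = record_uris[i + 1] if i + 1 < len(record_uris) else None
    return prev_uri, next_uri
```

Because record order is significant, both functions work on positions in the original list and never re-sort it.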

A couple of more things--in data exchange, a simple link to the contained resource is all you need, but in a web service you would probably want a little metadata, like:

This metadata helps a "browser" of this data know something about the contained thing without necessarily having to fetch it to find out about it. We used "Container" for all types of things, including waypoints.

I suppose you could accomplish the need for content metadata by including, in the fetch of a Collection, summarized metadata descriptions for the contained resources.

stoicflame commented 13 years ago

Very good points.

> I don't really understand the 2nd option very well so all I can say about it is that it seems more verbose, so option 1 appeals more to me.

It's just a "linked list" implementation where each node in the list holds a resource and a pointer (URI) to the "next" node. The last node "points to" null (rdf:nil).

But I agree. Yuck.