RockefellerArchiveCenter / rac-data-model

Working repo for Project Electron data model.

Consider extending the ancestors and children objects in schema #22

Closed by p-galligan 5 years ago

p-galligan commented 5 years ago

Do we want to extend these to show different types of relationships between related collections?

helrond commented 5 years ago

I think there are probably two parts to this question:

  1. What is the scope of the data inside the ancestors and children arrays?
  2. What does each object in those arrays look like?

For the first question, I'd like to propose the following:

  1. Data in the children array should be the direct children of that collection only.
  2. Data in the ancestors array should be all of the ancestor components, leading back to the top-level collection that has no parent.

So, for a collection hierarchy like this:

Collection A
  Record group B
    Subgroup C
      File D
      File E
      File F

The ancestors key for Subgroup C would look something like:

`"ancestors": [{Record group B}, {Collection A}]`

While the children key would be:

`"children": [{File D}, {File E}, {File F}]`

In both cases, the order is significant. We may want to consider inverting the order of ancestors so it's top-down rather than bottom-up. And, as this issue indicates, we may also want to explicitly add a data element that articulates that order.
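To make the proposed scope concrete, here is a minimal Python sketch of how the two arrays could be derived for the example hierarchy. The `PARENTS` map and the `ancestors` and `children` helpers are hypothetical names made up for illustration, not part of the data model or any existing code, and the `top_down` flag only demonstrates the inverted ordering floated above.

```python
# Hypothetical parent map for the example hierarchy (child -> parent).
# Insertion order stands in for the order of components in the source.
PARENTS = {
    "Collection A": None,
    "Record group B": "Collection A",
    "Subgroup C": "Record group B",
    "File D": "Subgroup C",
    "File E": "Subgroup C",
    "File F": "Subgroup C",
}


def ancestors(node, parents, top_down=False):
    """Return every ancestor of `node`, nearest first unless top_down is set."""
    chain = []
    parent = parents[node]
    while parent is not None:
        chain.append(parent)
        parent = parents[parent]
    # Reverse only when the caller wants the top-level collection first.
    return list(reversed(chain)) if top_down else chain


def children(node, parents):
    """Return only the direct children of `node`, in source order."""
    return [child for child, parent in parents.items() if parent == node]


print(ancestors("Subgroup C", PARENTS))
print(ancestors("Subgroup C", PARENTS, top_down=True))
print(children("Subgroup C", PARENTS))
```

Called on Subgroup C, the default ordering matches the bottom-up `ancestors` example above, and `top_down=True` gives the inverted alternative.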

For the second question, I think we'll need to define what this data looks like both pre- and post-indexing, since we'll probably want to do some denormalization of data during indexing. My thought right now is that, pre-index, each object in the ancestors and children arrays looks like:

{
    "external_identifiers":  [
        {
            "source": "archivesspace",
            "identifier": "/archival_objects/123"
        }
    ]
}

All of this data can be derived from the data source. Then, once the object has been indexed (or perhaps more accurately, when the object identified in ArchivesSpace by /archival_objects/123 has been indexed), that object in the ancestors or children array would look like:

{
    "title": "Correspondence, 1925-1930",
    "uri": "collections/23rq4df3asdfaf",
    "external_identifiers": [
        {
            "source": "archivesspace",
            "identifier": "/archival_objects/123"
        }
    ]
}
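One way the denormalization step might work is sketched below in Python. This is only an illustration under assumptions: the `denormalize` function and the `INDEXED` lookup (keyed by source and identifier) are hypothetical, and how indexed objects are actually stored and looked up is not something this issue defines.

```python
# Hypothetical lookup of already-indexed objects, keyed by (source, identifier).
INDEXED = {
    ("archivesspace", "/archival_objects/123"): {
        "title": "Correspondence, 1925-1930",
        "uri": "collections/23rq4df3asdfaf",
    },
}


def denormalize(reference, indexed):
    """Add title and uri to a pre-index reference once its target is indexed."""
    for ext_id in reference["external_identifiers"]:
        match = indexed.get((ext_id["source"], ext_id["identifier"]))
        if match is not None:
            # Copy the display fields next to the identifiers we already have.
            return {"title": match["title"], "uri": match["uri"], **reference}
    # Target has not been indexed yet, so leave the reference unchanged.
    return reference


pre_index = {
    "external_identifiers": [
        {"source": "archivesspace", "identifier": "/archival_objects/123"}
    ]
}
post_index = denormalize(pre_index, INDEXED)
```

A reference whose target has not been indexed passes through untouched, which keeps the pre-index shape valid until the related object shows up.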

Thoughts/counterpoints welcome!