Islandora / documentation

Contains islandora's documentation and main issue queue.

"no relation" option needed for typed_relation field #1770

Closed kspurgin closed 3 years ago

kspurgin commented 3 years ago

This was discussed on Slack starting here and in the 2021-03-03 Islandora Tech Call.

Real life metadata often has a mix of names with and without explicitly coded relationship/relator terms.

Currently the typed_relation field requires that a rel_type be selected/populated. 'Abridger' is first in the list as configured out of the box, so it gets selected as the default, which will cause data quality issues when metadata is being created.

I think there are two parts to this issue:

Part 1 - Need to be able to not specify a rel_type value

The only "correct" rel_type to record for names/terms without associated rel_type would be something like:

(I'll digress in a comment on why this is the only correct value)

From a user-manually-entering-metadata or person-responsible-for-migrating-legacy-metadata-into-I8 perspective, I do not want to have to find/select/enter such a rel_type as an extra step.

From a user-viewing-metadata-display perspective, I don't want to see this, as it adds no information, may be confusing, and clutters the display.

Solution? Remove the isEmpty function preventing empty values from saving in a typed_relation field configuration. Add, at the top of the rel_type list (so it acts as default value) a blank value.
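
A minimal sketch of the proposed behaviour, written in Python rather than the module's actual PHP. The class name `TypedRelationItem`, its attributes, and `REL_TYPE_OPTIONS` are hypothetical stand-ins for illustration, not the real field implementation. The idea is that a value counts as empty only when no name/term is chosen, and that a blank option sits first in the list so it becomes the default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypedRelationItem:
    """Hypothetical stand-in for one typed_relation field value."""

    target_id: Optional[int]  # the referenced name/term; None if nothing was chosen
    rel_type: str = ""        # relator key; "" means no relation was specified

    def is_empty(self) -> bool:
        # Only a missing target makes the item empty. A blank rel_type
        # on its own does not stop the value from being saved.
        return self.target_id is None


# Blank option listed first so that it acts as the default selection.
REL_TYPE_OPTIONS = ["", "relators:abr", "relators:act"]
```

With this shape, `TypedRelationItem(target_id=33)` is saved with a blank rel_type instead of silently picking up 'Abridger'.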

Part 2 - How a typed_relation field with no specified rel_type gets mapped to JSON-LD

I think the specifics of this are dependent on the assumed use cases for the resulting JSON-LD, and I'm not 100% clear on what we assume those are.

On a past project dealing with a similar issue, we split all names in catalog records into categories: creators, contributors, donors, owners, publishers/etc, and other. If a name was not in a field specifically reserved for recording creators, and did not specify an explicit relationship type, we treated it "under the hood" as though it were in the contributor category.

For most purposes, I think similar treatment is probably fine in the JSON-LD (and OAI-PMH, etc): if no rel_type value specified, output with predicate https://id.loc.gov/vocabulary/relators/ctb.html

Disclaimer: My assumption in saying this is that the JSON-LD expression is not expected to be the canonical "what's in the system" version of the metadata. The canonical "what's in the system" version should express the fact that I didn't assign a relationship type to the value.

I think this should be transparent and easily configurable per field, though, as needs for this may differ.
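
One way the per-field fallback could look, sketched in Python under stated assumptions: the function `predicate_for`, the `NAMESPACES` map, and the field name `field_agent` are all hypothetical, and the relators namespace URI is assumed from the Library of Congress vocabulary. A blank rel_type resolves to whatever fallback the field is configured with, and to no predicate at all if the field has none.

```python
from typing import Dict, Optional

# Assumed prefix-to-namespace map for relator keys such as "relators:ctb".
NAMESPACES: Dict[str, str] = {
    "relators": "http://id.loc.gov/vocabulary/relators/",
}

# Hypothetical per-field configuration: which relator to use when none is given.
FIELD_FALLBACKS: Dict[str, Optional[str]] = {
    "field_agent": "relators:ctb",
}


def predicate_for(rel_type: str, fallback: Optional[str]) -> Optional[str]:
    """Return the full predicate URI for a rel_type, using the fallback if blank."""
    key = rel_type.strip() or (fallback or "")
    if not key:
        return None  # no rel_type and no fallback configured for this field
    prefix, _, local = key.partition(":")
    if prefix not in NAMESPACES or not local:
        raise ValueError(f"unrecognised relator key: {key!r}")
    return NAMESPACES[prefix] + local
```

Calling `predicate_for("", FIELD_FALLBACKS["field_agent"])` yields the contributor predicate, while an explicit rel_type is passed through unchanged. Keeping the fallback in configuration rather than in code is what makes the substitution visible and adjustable per field.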

kspurgin commented 3 years ago

Digression for part 1:

Subbing in/forcing use of "contributor" when there is no rel_type pollutes the metadata.

Lack of a rel_type can mean many things:

Forcing the recording of any value across the board in this case will introduce inaccuracy in the metadata that can't be identified or fixed later, because each name very confidently has a relator term assigned.

Assigning no rel_type, or one that specifically means "no rel type", retains the ambiguity in a way that will support future data remediation, or at least follows "tell the truth or don't say anything."

kspurgin commented 3 years ago

@seth-shaw-unlv @elizoller Tried to capture this

I will cast about for whether there's a better predicate for Part 2.

seth-shaw-unlv commented 3 years ago

JSON-LD is important because it is the basis for the object's description in Fedora, presumably the "preservation copy" of an object's descriptive metadata. If, for instance, you completely lose your Drupal and only have your Fedora remaining, anything that isn't included in the JSON-LD would be lost. Also, because Fedora is now a linked data platform, you need to have a predicate of some sort to relate an object and an agent.

Now that I think about it, MARC may give us something better than "contributor" as a fall-back:

Associated name [asn]

A person or organization associated with or found in an item or collection, which cannot be determined to be that of a Former owner [fmo] or other designated relationship indicative of provenance

which looks to me like MARC shrugging its shoulders as to how they are related.

kspurgin commented 3 years ago

Associated name [asn] is what I'd arrived at as the best value, too. Just hadn't had a chance to get back over here and say that.

I understand the need for the predicate in the linked data under the hood. My preference would still be to have a "blank in Drupal display, but maps to asn in the under-the-hood linked data" option. However, I feel slightly better about displaying "Associated name" than I would something like "No relationship specified," if it comes to that.

Given that I8 can be run without Fedora at all, I guess what people consider the canonical version of their metadata will vary.

seth-shaw-unlv commented 3 years ago

The easiest way to do that would be to add an asn entry with a blank label as the first possible relator in the field's configuration:

relators:asn|
relators:abr|Abridger (abr)
relators:act|Actor (act)

This will display as:

[Screenshot: Screen Shot 2021-03-05 at 8 06 23 AM (image not available)]

The JSON-LD would be:

{
  "@graph": [
    {
      "@id": "http://future.islandora.ca/node/42",
      "@type": [
        "http://pcdm.org/models#Object",
        "https://schema.org/DigitalDocument"
      ],
      "http://schema.org/author": [
        {
          "@id": "http://future.islandora.ca/user/3"
        }
      ],
      "http://purl.org/dc/terms/title": [
        {
          "@value": "TNT Product Warnings",
          "@language": "en"
        }
      ],
      "http://schema.org/dateCreated": [
        {
          "@value": "2021-03-05T16:04:06+00:00",
          "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
        }
      ],
      "http://schema.org/dateModified": [
        {
          "@value": "2021-03-05T16:05:51+00:00",
          "@type": "http://www.w3.org/2001/XMLSchema#dateTime"
        }
      ],
      "http://purl.org/dc/terms/extent": [
        {
          "@value": "1 item",
          "@type": "http://www.w3.org/2001/XMLSchema#string"
        }
      ],
      "http://id.loc.gov/vocabulary/relators/asn": [
        {
          "@id": "http://future.islandora.ca/taxonomy/term/33?_format=jsonld"
        }
      ],
      "http://schema.org/sameAs": [
        {
          "@id": "http://future.islandora.ca/node/42"
        }
      ]
    },
    {
      "@id": "http://future.islandora.ca/user/3",
      "@type": [
        "http://schema.org/Person"
      ]
    }
  ]
}

We could update the formatter to omit the colon separator if the rel_type label is blank to clean it up a bit.
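
A rough Python sketch of that formatter behaviour, not the actual controlled_access_terms code: `parse_allowed_values`, `format_relation`, and the "label: name" output shape are assumptions made for illustration. It reads the `key|label` lines from the configuration shown earlier and drops the separator when the label is blank.

```python
from typing import Dict

CONFIG = """\
relators:asn|
relators:abr|Abridger (abr)
relators:act|Actor (act)
"""


def parse_allowed_values(config: str) -> Dict[str, str]:
    """Parse 'key|label' lines into a key -> label mapping."""
    labels: Dict[str, str] = {}
    for line in config.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, label = line.partition("|")
        labels[key.strip()] = label.strip()
    return labels


def format_relation(rel_type: str, name: str, labels: Dict[str, str]) -> str:
    """Render 'Label: Name', or just 'Name' when the relator label is blank."""
    label = labels.get(rel_type, "")
    return f"{label}: {name}" if label else name
```

With the configuration above, an `relators:asn` value renders as the bare name, while `relators:abr` still renders with its label and separator.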

seth-shaw-unlv commented 3 years ago

Just to confirm for the Sprint, @kspurgin, are we okay with the suggestion I made above to simply update the field configs in islandora_defaults to make a blank relator mapped to "Associated name" and clean up the field formatter in controlled_access_terms to remove the colon when the relator label is blank?

kspurgin commented 3 years ago

@seth-shaw-unlv - as far as I understand how this all works, yes.

The one other thing I'm hearing about this field from MIG is that the relator code should not be displayed, as that's meant for machine use only. Separate issue, but mentioning it in case it's easier just to fix it while you are in there.

seth-shaw-unlv commented 3 years ago

We can take them out. I don't believe I had them there initially. I think adding them was a request from @rosiel, but my memory may be off there.

rosiel commented 3 years ago

Might have been me, though I'm now of the opinion that displaying the code to the public is a mistake.

seth-shaw-unlv commented 3 years ago

Update from Slack: @kspurgin will update the field config to include the blank option and remove the parentheses. I'll do the formatter update (probably tomorrow).

elizoller commented 3 years ago

field config changes merged with https://github.com/Islandora/islandora_defaults/pull/50

elizoller commented 3 years ago

formatter changes merged with https://github.com/Islandora/controlled_access_terms/pull/64. @seth-shaw-unlv and @kspurgin, can this one be closed?

seth-shaw-unlv commented 3 years ago

👍