LD4P / sinopia_editor

Sinopia Linked Data Editor
https://sinopia.io/
Apache License 2.0

Nested resources & blank nodes & lookups (PCC wish list no. 2) #2787

Closed NancyL closed 3 years ago

NancyL commented 3 years ago

Not sure whether to call this a bug or an enhancement request.

When we create a nested resource in a template, we create a blank node. This makes sense for many of these resources such as bf:Contribution or bf:Title, where we would not put a URI in the position of the blank node.

However, in some cases, we create a nested resource only in order to identify the class of the object property. This happens most often combined with a lookup, which, by definition, has an existing URI. In order to have a place for that URI, we have to put in an "rdf:value" as below with bf:agent:

      bf:contribution [ a bf:Contribution ;
            bf:agent [ a bf:Person ;
                  rdf:value <URI> ;
                  rdfs:label "label"
            ]
      ] .

While this is valid rdf, it is not good rdf, in that it adds in an extra blank node where one is not necessary. The rdf should look as follows:

       bf:contribution [ a bf:Contribution ;
                bf:agent <URI for Agent>
        ] .

       <URI for Agent> a bf:Person ;
                rdfs:label "label" .

LC fixes this with some programming behind the scenes. They used to use bflc:target to indicate where to do it. They don't any more and I don't understand how they do it now.

This extra blank node is and will remain a major issue in interoperability given that other systems will not be looking for a blank node in that position, but instead a direct URI. I have no idea what the solution is. It is no. 2 on PCC's wish list; no. 1 on mine!

michelleif commented 3 years ago

can you say more about why the class needs to be specified for the object?

NancyL commented 3 years ago

Many of these have multiple object types to choose from, and the type needs to be specified. Not all vocabularies state that their entries are instances of a particular type (bf:Person is specified in NAF, but not in VIAF, ISNI, or Wikidata), and in sharing our data the expectation is to see these associations, at least in SVDE & with LC.

justinlittman commented 3 years ago

The request makes sense to me; I have no idea what it will take to implement.

michelleif commented 3 years ago

a couple follow-up questions: 1) are there other examples besides Agent Class in existing templates where you need to create additional statements about the object? 2) in the case of needing to say what class an Agent is...are there other ways to express that an Agent is an instance of a particular class? if we are using agent URIs from Wikidata, for example, could you add a statement in Wikidata: agent123 hasType bf:person?

michelleif commented 3 years ago

let's ask LC to show how they do this.

justinlittman commented 3 years ago

Do what?

michelleif commented 3 years ago

re the need to state what Agent class an Agent is: in the LC Editor's Contribution field, they display each subclass as a button:

[Screenshot: LC Editor Contribution field with a button for each Agent subclass]

Cataloger clicks the button and then finds the agent they want to add. Here's an example of clicking Family and then adding the agent via lookup:

[Screenshots: clicking Family, then adding the agent via lookup]

the resulting relevant RDF is:

<http://id.loc.gov/authorities/names/no2021026293> a bf:Family;
    rdfs:label "Smith (Family : Robeson County, N.C.)".
<http://id.loc.gov/resources/works/e269516439478861060154699502576268107904> a bf:Work;
    bf:contribution _:b4_b2.
_:b4_b2 a bf:Contribution;
    bf:agent <http://id.loc.gov/authorities/names/no2021026293>.

by contrast doing a similar thing in Sinopia (stage)

[Screenshot: the same step in Sinopia (stage)]

(btw couldn't find the same Smith Family in QA lookup so that's why the Agent is different)

you get this RDF

<> <http://sinopia.io/vocabulary/hasResourceTemplate> "pcc:bf2:Monograph:Work";
    a <http://id.loc.gov/ontologies/bibframe/Work>;
    <http://id.loc.gov/ontologies/bibframe/contribution> _:b3.
_:b3 a <http://id.loc.gov/ontologies/bibframe/Contribution>;
    <http://id.loc.gov/ontologies/bibframe/agent> _:b4.
_:b4 a <http://id.loc.gov/ontologies/bibframe/Family>;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://id.loc.gov/authorities/names/no2005092812>.
<http://id.loc.gov/authorities/names/no2005092812> <http://www.w3.org/2000/01/rdf-schema#label> "Smith Family (Musical group)".

how does the LC Editor create the "a bf:Family" statement on the agent URI based on the fact that the user clicked the "Family" button?

in Sinopia, the Contribution template has a separate subtemplate for each type of Agent, which introduces an extra blank node (_:b4 in this case). are the Agent "buttons" in the LC Editor simply subtemplates? if so how are they not ending up with a blank node there?

justinlittman commented 3 years ago

If my understanding is correct, what we want is:

<> <http://sinopia.io/vocabulary/hasResourceTemplate> "pcc:bf2:Monograph:Work";
    a <http://id.loc.gov/ontologies/bibframe/Work>;
    <http://id.loc.gov/ontologies/bibframe/contribution> _:b3.
_:b3 a <http://id.loc.gov/ontologies/bibframe/Contribution>;
    <http://id.loc.gov/ontologies/bibframe/agent> <http://id.loc.gov/authorities/names/no2005092812>.
<http://id.loc.gov/authorities/names/no2005092812> a <http://id.loc.gov/ontologies/bibframe/Family>;
    <http://www.w3.org/2000/01/rdf-schema#label> "Smith Family (Musical group)".

right?
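
For readers comparing the two shapes, the difference between the current Sinopia output and the desired output can be sketched as a small rewrite over triples. This is only an illustration in plain Python, not how Sinopia or the LC Editor works: the tuple representation, the `is_blank`/`is_uri` conventions, and the function name `collapse_value_nodes` are all assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
BF = "http://id.loc.gov/ontologies/bibframe/"


def is_blank(term):
    """Toy convention: blank nodes are written as '_:label'."""
    return term.startswith("_:")


def is_uri(term):
    """Toy convention: URIs are plain strings starting with 'http'."""
    return term.startswith("http")


def collapse_value_nodes(triples):
    """Replace each blank node that wraps a URI via rdf:value with that URI.

    Statements about the blank node (such as its class) are moved onto
    the URI, and the rdf:value triple itself is dropped.
    """
    # Map each wrapping blank node to the URI it points at.
    wrapped = {
        s: o for s, p, o in triples
        if p == RDF_VALUE and is_blank(s) and is_uri(o)
    }
    result = []
    for s, p, o in triples:
        if s in wrapped and p == RDF_VALUE:
            continue  # the wrapper triple is no longer needed
        result.append((wrapped.get(s, s), p, wrapped.get(o, o)))
    return result


AGENT = "http://id.loc.gov/authorities/names/no2005092812"
sinopia_output = [
    ("_:b3", RDF_TYPE, BF + "Contribution"),
    ("_:b3", BF + "agent", "_:b4"),
    ("_:b4", RDF_TYPE, BF + "Family"),
    ("_:b4", RDF_VALUE, AGENT),
]
collapsed = collapse_value_nodes(sinopia_output)
for triple in collapsed:
    print(triple)
```

In the collapsed result, bf:agent points directly at the authority URI and the bf:Family class is stated about that URI, which matches the shape asked for above.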

sfolsom commented 3 years ago

Is the desire to add classes locally to external entities (e.g. bf:Agents) simply about supporting conversion back to MARC (100, 110, 111)? If so, rather than create local data for external datasets... another option could be that converters understand the external data models, and look to their native class types to decide whether something belongs in a particular MARC field.

NancyL commented 3 years ago

I think this may be getting a bit off track. There are several places in BF in which the object can be one of several BF classes. We have to name that class, either because: 1. there is no vocabulary with URIs, so a need to reify; 2. it is a specified range in BF; 3. the vocabulary used does not say that it is an instance of that particular object class. To get away from agent, which seems to confuse things, I offer bf:PlayingSpeed, one of several subclasses of bf:SoundCharacteristic, all using the property bf:soundCharacteristic. There is now a vocabulary (but it is a newer one, not present when we started Sinopia), but there is nothing in that vocabulary that says it's a bf:PlayingSpeed. We need to do that. And the only way is to put in that extra blank node with an rdf:value, which is not good rdf. I'll answer specific questions in another comment.

NancyL commented 3 years ago

To answer Michelle's questions from some time back:

  1. are there other examples besides Agent Class in existing templates where you need to create additional statements about the object? Yes, many. But this is not always about additional statements. There may just be the object URI & the type of object URI.
  2. in the case of needing to say what class an Agent is...are there other ways to express that an Agent is an instance of a particular class? if we are using agent URIs from Wikidata, for example, could you add a statement in Wikidata: agent123 hasType bf:person? You would still need a blank node to do this.

NancyL commented 3 years ago

Regarding the LC profiles--the buttons are sub-profiles.

NancyL commented 3 years ago

And @sfolsom, this is truly about BF, not MARC, though of course it would help for conversion to MARC.

michelleif commented 3 years ago

regarding the range use case, the range is implied by the property and doesn't need to be stated

if a property has a range, then any object of that property is by definition a member of the range's class

Example from https://en.wikipedia.org/wiki/RDF_Schema: if ex:employer rdfs:range foaf:Organization, then from the statement ex:John ex:employer ex:CompanyX it can be inferred that ex:CompanyX is a foaf:Organization.
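
The range rule in that example can be sketched in a few lines of plain Python. This is a toy illustration of the rdfs:range entailment only, not a real reasoner; the tuple representation and the function name `infer_range_types` are assumptions made for the example.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """Return the rdf:type triples entailed by rdfs:range declarations.

    For every 'P rdfs:range C' and every 'S P O' in the input,
    the triple 'O rdf:type C' is inferred.
    """
    # Collect the declared range classes for each property.
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in triples:
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    # Only report statements that were not already asserted.
    return inferred - set(triples)


data = [
    ("ex:employer", "rdfs:range", "foaf:Organization"),
    ("ex:John", "ex:employer", "ex:CompanyX"),
]
inferred = infer_range_types(data)
print(inferred)
```

A consumer has to run a step like this to obtain the class; if the class is stated explicitly in the data, no such step is needed.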

NancyL commented 3 years ago

That is true in theory. But we do not have any inferencing available to us, so I don't think that really works here. Also, given that both SVDE & LC add these into their data, and we want to interact with that data, I think it still stands as a use case.

justinlittman commented 3 years ago

I just took a closer look at this and realized my earlier assessment was incorrect. I now believe the problem is that we are injecting an unnecessary level of nested resources in the template structure.

The current model:

ld4p:RT:bf2:Monograph:Work:Un-nested (template)
   has class http://id.loc.gov/ontologies/bibframe/Work
   has property http://id.loc.gov/ontologies/bibframe/contribution

http://id.loc.gov/ontologies/bibframe/contribution (property)
   has value pcc:bflc:PrimaryContribution (template) or pcc:bf2:Contribution (template)

pcc:bflc:PrimaryContribution (template)
   has class http://id.loc.gov/ontologies/bflc/PrimaryContribution
   has property http://id.loc.gov/ontologies/bibframe/agent

http://id.loc.gov/ontologies/bibframe/agent (property)
   has value pcc:bf2:Agent:Conference (template), pcc:bf2:Agent:CorporateBody (template), etc.

pcc:bf2:Agent:Conference (template)
   has class http://id.loc.gov/ontologies/bibframe/Meeting
   has property http://www.w3.org/1999/02/22-rdf-syntax-ns#value

http://www.w3.org/1999/02/22-rdf-syntax-ns#value
   is a lookup of LOC all names 

If this were changed to the following model, the extra blank node would be removed:

ld4p:RT:bf2:Monograph:Work:Un-nested (template)
   has class http://id.loc.gov/ontologies/bibframe/Work
   has property http://id.loc.gov/ontologies/bibframe/contribution

http://id.loc.gov/ontologies/bibframe/contribution (property)
   has value pcc:bflc:PrimaryContribution (template) or pcc:bf2:Contribution (template)

pcc:bflc:PrimaryContribution (template)
   has class http://id.loc.gov/ontologies/bflc/PrimaryContribution
   has property http://id.loc.gov/ontologies/bibframe/agent

http://id.loc.gov/ontologies/bibframe/agent (property)
   is a lookup of LOC all names, etc.

justinlittman commented 3 years ago

This is implemented for primary contributors in jlit:RT:bf2:Monograph:Work:Un-nested on stage and produces:

<> a <http://id.loc.gov/ontologies/bibframe/Work>;
    <http://id.loc.gov/ontologies/bibframe/contribution> _:b93.
_:b93 a <http://id.loc.gov/ontologies/bflc/PrimaryContribution>;
    <http://id.loc.gov/ontologies/bibframe/agent> <http://id.loc.gov/authorities/names/n79032058>.
<http://id.loc.gov/authorities/names/n79032058> <http://www.w3.org/2000/01/rdf-schema#label> "Wittgenstein, Ludwig, 1889-1951@en".

instead of:

<> a <http://id.loc.gov/ontologies/bibframe/Work>;
    <http://id.loc.gov/ontologies/bibframe/contribution> _:b172.
_:b172 a <http://id.loc.gov/ontologies/bflc/PrimaryContribution>;
    <http://id.loc.gov/ontologies/bibframe/agent> _:b173.
_:b173 a <http://id.loc.gov/ontologies/bibframe/Person>;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> <http://id.loc.gov/authorities/names/n79032058>.
<http://id.loc.gov/authorities/names/n79032058> <http://www.w3.org/2000/01/rdf-schema#label> "Wittgenstein, Ludwig, 1889-1951@en".

NancyL commented 3 years ago

We still need to say Wittgenstein is a bf:Person.

justinlittman commented 3 years ago

Why? We cache the label because we need it for display in the editor, but why do we need the class?

NancyL commented 3 years ago

Only NAF would say that Wittgenstein is a bf:Person outside of Sinopia and not a corporate body or a family, etc. No other vocabulary would do so; nor would it happen if the name was not yet part of a vocabulary that has URIs. It is also a matter of compatibility in that both LC and SHARE-VDE explicitly state this.

justinlittman commented 3 years ago

Here's what I think we want to do:

I don't think we should do any "magic" (like try to guess when to remove blank nodes). This is likely to be confusing (both in terms of code and users) and very Bibframe-specific.

jermnelson commented 3 years ago

Here's what I think we want to do:

  • Add a new property to sinopia:template:property:lookup for a list of value classes (which are URIs). So, for example, for the http://id.loc.gov/ontologies/bibframe/agent property of the pcc:bflc:PrimaryContribution template, this list could include bf:Meeting, bf:Conference etc.

I think this is similar in concept to LOC's Profile Editor Value Data Type, except it would allow for multiple values instead of just one.

NancyL commented 3 years ago

Okay, regarding this I probably concentrated on the wrong aspect, but this is a summary of the problem & a response to questions. I would also like to emphasize this does not only apply to bf:Contribution, but to any template with lookups.

  1. While it is true that "if a property has a range, then any object of that property is by definition a member of the range's class", that relies on inferencing, which we do not have in Sinopia.
  2. While we do not necessarily need the final class/subclass if it is a lookup, we quite often need to provide a new vocabulary term that is not part of a lookup we have or that is part of a vocabulary that does not have URIs. In this case, we need to turn a literal into an object and name the class it is an instance of. So even if the lookup does not need the class named, the reified literal does. Until you provided the possible solution above, this would have required us to duplicate the property and model it separately, which we cannot do. But the solution above could just work, though I'm not completely clear on it.
  3. This solution might also work in the case of properties like bf:soundCharacteristic for which the subclasses of bf:SoundCharacteristic are highly varied and have very separate vocabularies. Here the problem is again the non-repeatability of properties resulting in simply a list of lookups, but in this case those lookups only apply to certain subclasses. A single property doesn't allow for individual guidance for the subclasses.

justinlittman commented 3 years ago

@NancyL Can you provide some examples of 2 and 3?

NancyL commented 3 years ago

For no. 2. I am cataloging a whole bunch of piano rolls, either tempo 75 or tempo 80. It is important to me to have these interpreted and then listed in playback speeds for my larger sound archive. Interpreted, they mean speeds of 7.5 feet per minute and 8 feet per minute. These values are not included in the Playing speed vocabulary in id.loc.gov, nor in any other linked vocabulary that I know of and it is unlikely LC would add them to their vocabulary, because they do not catalog piano rolls in this fashion. Thus, I need to add them in as new vocabulary terms, as instances of bf:PlayingSpeed, and as objects with URIs of their own.

NancyL commented 3 years ago

For no. 3, my example is bf:SoundCharacteristic. Its subclasses are: bf:RecordingMethod, bf:RecordingMedium, bf:PlayingSpeed, bf:GrooveCharacteristic, bf:TrackConfig, bf:TapeConfig, bf:PlaybackChannels, bf:PlaybackCharacteristic. All of these have separate vocabularies in id.loc.gov that apply only to the specific subclass.

justinlittman commented 3 years ago

After stewing over this, here is my latest proposal:

A resource template should have resource properties (just like a property template has property attributes). For now, there is only one possible resource property: "suppressible" (or some similar term).

When a resource is suppressible and if it only has one property template, then when used as a nested resource and the value is a URI, the URI will be used as the resource and the class and label will be recorded. When used as a nested resource and the value is a literal, a blank node will be created as it currently is.
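
The rule above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Sinopia implementation: the `ResourceTemplate` class, the `nested_triples` function, the fixed blank node label, and the PlayingSpeed template with a single rdf:value property are all hypothetical, and the sketch handles only one value per nested resource.

```python
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BF = "http://id.loc.gov/ontologies/bibframe/"


@dataclass
class ResourceTemplate:
    """Hypothetical stand-in for a nested resource template."""
    resource_class: str
    property_uris: list          # one entry per property template
    suppressible: bool = False


def nested_triples(parent, prop, template, value, value_is_uri,
                   label=None, bnode="_:b0"):
    """Serialize one nested resource according to the proposed rule."""
    can_suppress = (
        template.suppressible
        and len(template.property_uris) == 1
        and value_is_uri
    )
    if can_suppress:
        # The URI stands in for the nested resource; class and label
        # are recorded on the URI itself.
        triples = [
            (parent, prop, value),
            (value, RDF_TYPE, template.resource_class),
        ]
        if label is not None:
            triples.append((value, RDFS_LABEL, label))
        return triples
    # Otherwise keep the blank node, as is done today.
    inner_prop = template.property_uris[0]
    return [
        (parent, prop, bnode),
        (bnode, RDF_TYPE, template.resource_class),
        (bnode, inner_prop, value),
    ]


WITTGENSTEIN = "http://id.loc.gov/authorities/names/n79032058"
person = ResourceTemplate(BF + "Person", [RDF_VALUE], suppressible=True)
speed = ResourceTemplate(BF + "PlayingSpeed", [RDF_VALUE], suppressible=True)

# URI value: no blank node, class and label attached to the URI.
uri_case = nested_triples("_:b93", BF + "agent", person, WITTGENSTEIN,
                          True, label="Wittgenstein, Ludwig, 1889-1951")
# Literal value: the blank node is kept so the class can still be stated.
literal_case = nested_triples("_:b1", BF + "soundCharacteristic", speed,
                              "7.5 feet per minute", False)
```

Under these assumptions the URI case yields the direct bf:agent link plus a bf:Person statement, while the literal case still produces a typed blank node.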

I think this will satisfy all of the use cases described in this ticket.

NancyL commented 3 years ago

I think that sounds right! Will you provide an example to look at when you can?

justinlittman commented 3 years ago

Better yet, deploying to dev environment now. I'll let you know when it is ready for testing.

justinlittman commented 3 years ago

@NancyL OK to close this ticket?

NancyL commented 3 years ago

Yes, though there will likely be follow-up enhancements requested on new tickets...