srophe / syriaca-data

Repository for Syriaca.org TEI data, used by srophe-eXist-app.

Keeping CBSC keywords as a semi-distinct subset of the Syriaca taxonomy #930

Open wlpotter opened 2 years ago

wlpotter commented 2 years ago

@dlschwartz noted in this comment that Sergey may want to use only a CBSC-specific subset of Syriaca keywords:

Column C [in the taxonomy spreadsheet] was mainly about keeping track of CBSC keywords. We're not currently transforming that. However, at one point in discussing things with Sergey, he was concerned with the expansion of the keywords. He didn't really want to engage with all of the new ones. Or at least he wanted a way to work with a subset of the growing list of keywords. We could use this column to designate a subset of CBSC keywords. That said, we wouldn't necessarily need that in the TEI record. That subset could simply live in Zotero. This might need some discussion.

dlschwartz commented 2 years ago

I'm just sort of thinking on screen here. We'll probably want to discuss this.

The current transform takes columns H and I and turns them into <idno type="URI">. This isn't right. I think these should instead be a skos:closeMatch encoded using a <relation> element.

We would then want a column to do the same with the SNAP relationships that are also close matches.

@davidamichelson perhaps you could weigh in here just to make sure we're being consistent with our use of ISO language codes. Do we consider a Syriaca URI for a language a skos:exactMatch or a skos:closeMatch? [I'm pretty sure owl:sameAs would be incorrect but correct me if I'm wrong.]

Of course, none of this addresses this issue directly. But it helps. I think we would want a narrower Syriaca relationship to have a skos:broadMatch with a broader SNAP concept when no close match exists in SNAP. So we would then have something like: <relation ref="skos:broadMatch" active="http://syriaca.org/keyword/bishop-over-monk" passive="snap:professionalRelationship"/>. This would let us share a bishop-over-monk relationship with SNAP as a simple professional relationship.
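A minimal sketch of the rule described above (prefer a close match, fall back to a broader match), assuming Python's standard `xml.etree.ElementTree`. The helper name `mapping_relation` and its parameters are hypothetical and are not part of the existing transform; the `snap:` value is the one from the example above.

```python
import xml.etree.ElementTree as ET


def mapping_relation(keyword_uri, close_match=None, broader_match=None):
    """Build a TEI <relation> for a keyword (hypothetical helper).

    Uses skos:closeMatch when a close match is known, otherwise
    skos:broadMatch when only a broader external concept exists.
    Returns None when there is nothing to map to.
    """
    if close_match:
        ref, target = "skos:closeMatch", close_match
    elif broader_match:
        ref, target = "skos:broadMatch", broader_match
    else:
        return None
    return ET.Element(
        "relation",
        {"ref": ref, "active": keyword_uri, "passive": target},
    )


# No close match in SNAP, so the broader SNAP concept is used.
rel = mapping_relation(
    "http://syriaca.org/keyword/bishop-over-monk",
    broader_match="snap:professionalRelationship",
)
print(ET.tostring(rel, encoding="unicode"))
```

The element is built without a TEI namespace to keep the sketch short; a real transform would presumably need to emit it in the TEI namespace.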

A couple of questions we need to deal with:

wlpotter commented 2 years ago

@dlschwartz we should discuss this (though is this a separate issue from marking the CBSC subset of keywords?)

A few thoughts and notes on the questions you raised:

The mapping properties skos:broadMatch, skos:narrowMatch and skos:relatedMatch are provided as a convenience, for situations where the provenance of data is known, and it is useful to be able to tell at a glance the difference between internal links within a concept scheme and mapping links between concept schemes.

I wonder if we could use skos:broader, skos:narrower for the hierarchies of Syriaca categories and the mapping properties for lining up external concept schemes? From the SKOS perspective, as I understand it, this is an allowable convention rather than a hard distinction:

The rationale behind this design is that it is hard to draw an absolute distinction between internal links within a concept scheme and mapping links between concept schemes. This is especially true in an open environment where different people might re-organize concepts into concept schemes in different ways. What one person views as two concept schemes with mapping links between, another might view as one single concept scheme with internal links only. This specification allows both points of view to co-exist, which (it is hoped) will promote flexibility and innovation in the re-use of SKOS data in the Web.

wlpotter commented 2 years ago

Also relevant to this discussion is section 9.6.4 SKOS Concepts, Concept Collections and Semantic Relations under Concept Collections:

In the SKOS data model, skos:Concept and skos:Collection are disjoint classes. The domain and range of the SKOS semantic relation properties is skos:Concept. Therefore, if any of the SKOS semantic relation properties (e.g., skos:narrower) are used to link to or from a collection, the graph will not be consistent with the SKOS data model.

I think that since the mapping properties also derive from skos:semanticRelation (cf. section 10.3: "skos:closeMatch, skos:broadMatch, skos:narrowMatch and skos:relatedMatch are each sub-properties of skos:mappingRelation"; and "skos:mappingRelation is a sub-property of skos:semanticRelation."), the above caution refers both to the semantic relations (e.g., skos:narrower) and the mapping properties (e.g., skos:broadMatch).
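The sub-property reasoning above can be sketched as a small lookup. This is only an illustration in plain Python, not a SKOS reasoner; the table lists sub-property statements from the SKOS Reference, and the function name `is_semantic_relation` is hypothetical.

```python
# Sub-property statements from the SKOS Reference (child -> parent).
SUB_PROPERTY_OF = {
    "skos:exactMatch": "skos:closeMatch",
    "skos:closeMatch": "skos:mappingRelation",
    "skos:broadMatch": "skos:mappingRelation",
    "skos:narrowMatch": "skos:mappingRelation",
    "skos:relatedMatch": "skos:mappingRelation",
    "skos:mappingRelation": "skos:semanticRelation",
    "skos:broader": "skos:broaderTransitive",
    "skos:narrower": "skos:narrowerTransitive",
    "skos:broaderTransitive": "skos:semanticRelation",
    "skos:narrowerTransitive": "skos:semanticRelation",
    "skos:related": "skos:semanticRelation",
}


def is_semantic_relation(prop):
    """Follow the sub-property chain up to skos:semanticRelation."""
    while prop is not None:
        if prop == "skos:semanticRelation":
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False


# Both kinds of property inherit the skos:Concept domain and range,
# so neither should link to or from a skos:Collection.
print(is_semantic_relation("skos:narrower"))
print(is_semantic_relation("skos:broadMatch"))
```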

dlschwartz commented 2 years ago

@wlpotter I agree that we should be using @ref more broadly as we up our LOD game. For example, we need to reconcile our list of religious communities and our list of place types with the taxonomy, creating new URIs where necessary, and use @ref with the URI for these.

dlschwartz commented 2 years ago

@wlpotter to get this issue back to the question you originally asked, could this be as simple as adding @subtype="CBSC" to the relevant tei:idno? We might need to discuss this. I don't know how we would manage synchronizing data between the Taxonomy and the place where Sergey will use this information, i.e. Zotero. I think we might just want this to live there.

wlpotter commented 2 years ago

@dlschwartz I think most of the above has been continued in https://github.com/wlpotter/csv-to-srophe/issues/35.

I'd be fine leaving the designation of a keyword as CBSC in Zotero, unless we would need this for the Srophe app to allow users to browse only by CBSC keyword. (I don't know whether that's a scenario; I'm not sure what the ultimate plan is regarding CBSC and the Syriaca bibl module.)

We already use entryFree/@subtype for "category" (column E here). A few options off the top of my head:

Again, I think a lot of this would depend on how we ultimately envision the relationship between the CBSC and the Syriaca bibl module as well as the relationship between the CBSC set of keywords and the portion of the Syriaca taxonomy that overlaps with them.

dlschwartz commented 2 years ago

Also remember that by definition the only keywords on bibliography records will be the keywords used in Zotero. Bibliography browse would just contain the options to browse by what's already there and not what's absent. If someone typed in a keyword to search the bibliography and it isn't in use, they just wouldn't get any results.

wlpotter commented 2 years ago

@dlschwartz @davidamichelson from our discussion on Wednesday, I believe this issue can be closed? The taxonomy-specific discussion has moved here.

For CBSC, we have decided to use the Zotero tag set based on Sergey's needs. These tags will have a reference to the entity URIs for keywords, persons, places, works, etc. The zot2bibl or another maintenance script will process these tags and create the links between bibl records and the entity referenced by the Zotero tag.
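As a rough sketch of what such a maintenance step might look like: the snippet below assumes, hypothetically, that each relevant Zotero tag contains a full Syriaca entity URI somewhere in its text. The tag format, the `URI_PATTERN` regex, and the `entity_links` function are illustrative assumptions, not the actual behavior of zot2bibl.

```python
import re

# Assumed URI shape: http(s)://syriaca.org/<entity type>/<id>.
URI_PATTERN = re.compile(
    r"https?://syriaca\.org/(keyword|person|place|work)/[\w-]+"
)


def entity_links(tags):
    """Return (entity_type, uri) pairs for tags that carry an entity URI.

    Tags without a recognizable URI are skipped.
    """
    links = []
    for tag in tags:
        match = URI_PATTERN.search(tag)
        if match:
            links.append((match.group(1), match.group(0)))
    return links


# Hypothetical tag strings for a single Zotero item.
tags = [
    "Bishops http://syriaca.org/keyword/bishop-over-monk",
    "a plain tag without a URI",
    "http://syriaca.org/place/78",
]
for entity_type, uri in entity_links(tags):
    print(entity_type, uri)
```

Each pair could then be turned into a link between the bibl record and the referenced entity.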

wlpotter commented 2 years ago

@dlschwartz I believe this has been resolved, but I will leave it open in case you need or want to document any of the above discussion for taxonomy purposes.