OnroerendErfgoed / atramhasis

An online SKOS editor
http://atramhasis.readthedocs.io/
GNU General Public License v3.0

Support for SKOS OrderedCollection #723

Open mielvds opened 1 year ago

mielvds commented 1 year ago

Would it be possible to extend the support for Collection to OrderedCollection as well?

koenedaele commented 1 year ago

What's the use case you're trying to support?

The one we have is sorting concepts chronologically in our thesaurus of periods: https://thesaurus.onroerenderfgoed.be/conceptschemes/DATERINGEN

If you look at https://thesaurus.onroerenderfgoed.be/conceptschemes/DATERINGEN/c/1251, you will see that its narrower periods are ranked chronologically, from oldest to newest. We do this by adding something called a Sort Label to the concepts in that collection:

Basically, it can be any string you want, and the concepts get sorted according to that string (so we left some empty spots in case the ordering needs to change again).
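As a rough illustration of the sort label idea (not the actual Atramhasis code), the sketch below sorts concepts by a free-text label compared as a plain string. The dictionary layout, the `sort_label` key and the period names are assumptions made up for the example.

```python
# Hypothetical concepts, each carrying a free-text sort label.
periods = [
    {"label": "Iron Age", "sort_label": "030"},
    {"label": "Stone Age", "sort_label": "010"},
    {"label": "Bronze Age", "sort_label": "020"},
]


def sort_by_sort_label(concepts):
    """Return the concepts ordered by sort label, compared as plain strings."""
    return sorted(concepts, key=lambda c: c["sort_label"])


ordered = [c["label"] for c in sort_by_sort_label(periods)]
print(ordered)

# The gap between "010" and "020" leaves room for a later insertion.
periods.append({"label": "Copper Age", "sort_label": "015"})
with_insert = [c["label"] for c in sort_by_sort_label(periods)]
print(with_insert)
```

Because the comparison is on strings, numeric-looking labels need zero padding ("9" would sort after "10").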

A few consequences of this:

So, I'm not sure if this is sufficient for your needs. Attaching the sort order to a relation is theoretically more accurate, but probably not that easy on the datamodel and the UI. The current implementation isn't the easiest on the editor either (you have to think of sort labels that make some sense), but since it's a use case we only have in one thesaurus it's manageable (and I see even in that thesaurus some of the data has lost the sortLabel).

If the above is sufficient for your needs, it should be possible to look at exporting this information to RDF in the form of skos:OrderedCollection. And vice versa, making it possible to import this again by generating sortLabels for an OrderedCollection. One complexity I see is how to export a concept with narrower concepts that have been ordered. I think this could be done by inserting a kind of anonymous OrderedCollection to indicate that the subconcepts should be ordered. But I expect that brings some other complexities with it that we would need to think through.

So, first things first, would something like this be enough for you?

mielvds commented 1 year ago

My concrete use case: maintaining the fixed order I got from what used to be a table of term definitions in a way that is SKOS compliant. They were manually sorted according to the order in which they should appear in end-user applications. But in the end, I'm mainly concerned about losing the OrderedCollection after importing into Atramhasis.

As for the practical implementation: I think that what you describe would suffice if it can somehow be translated to the SKOS import/export. As far as I know, skos:OrderedCollection is the only way to describe order using SKOS, but I have to admit that the use of an RDF list is not practical, especially when using SPARQL (queries are very complicated and extremely slow).

I see you currently map to skos:hiddenLabel, but that is not semantically accurate and will cause problems in the long run (e.g. in full-text search indexes and when there are multiple hidden labels). I recently got the suggestion to (also) use schema:position.

So to sum up:

koenedaele commented 1 year ago

The mapping to skos:hiddenLabel was mostly a quick fix to be able to keep all data on export. On import they just stay as hidden labels that can then be changed back to sort labels by the editor. I've never really seen a good use case for hidden labels so far, but someone probably has.

I'm certainly open to other options. skos:OrderedCollection does indeed look to be the official way. I'm also not too keen on the RDF lists, but I think we can make something work. Certainly for simple cases.

The schema:position is interesting. I did think about creating an atramhasis:sortLabel or such, but I didn't really want to create yet another ontology. So, schema:position might be a good fit. The range looks to be schema:Integer or schema:Text. The simplest implementation would be exporting the sortLabel to schema:position as an RDF literal and reading that on import as well, ignoring non-literal values. We could decouple the position from the list of labels in the skosprovider interface or even make it language independent, but that is a bigger change and I'm not certain it's worth it for a fairly rare use case.

How would you handle skos:OrderedCollection when dealing with sorting the narrower concepts of a concept? Adding an orderedcollection as an anonymous resource in between the broader concept and the narrower concepts just to create the order? Collections with a URI get imported as editable collections, collections without a URI just pass on the ordering to their narrower concepts.

So, we would have two ways of defining arbitrary orders on export/import: schema:position and skos:OrderedCollection. On import, schema:position would take precedence over skos:OrderedCollection.
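One possible reading of that precedence rule, sketched in plain Python. The function name and arguments are hypothetical, and treating schema:position as authoritative only when every member has one is an assumption of this sketch; the thread does not settle that detail.

```python
def import_order(members, positions, member_list=None):
    """Decide the order of collection members on import.

    members     -- list of member identifiers
    positions   -- dict mapping member id to its schema:position value
    member_list -- member ids in skos:memberList order, or None if absent
    """
    members = list(members)
    # schema:position wins when every member carries one (assumption).
    if positions and all(m in positions for m in members):
        return sorted(members, key=lambda m: str(positions[m]))
    # Otherwise fall back to the order given by the ordered collection.
    if member_list is not None:
        known = set(members)
        return [m for m in member_list if m in known]
    # No ordering information at all: keep the incoming order.
    return members


by_position = import_order(["a", "b", "c"], {"a": "3", "b": "1", "c": "2"}, ["a", "b", "c"])
by_list = import_order(["a", "b", "c"], {}, ["c", "a", "b"])
```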

Do you have any example files you could share? Would be useful to have as fixtures for unit tests.

mielvds commented 1 year ago

> The mapping to skos:hiddenLabel was mostly a quick fix to be able to keep all data on export. On import they just stay as hidden labels that can then be changed back to sort labels by the editor. I've never really seen a good use case for hidden labels so far, but someone probably has.

We are probably going to need them to make concepts findable using "slang": labels you don't want to endorse or display, but that you know users search for. For example, even though the NL label is "Sinaasappel", they would search for "Appelsien".

> I'm certainly open to other options. skos:OrderedCollection does indeed look to be the official way. I'm also not too keen on the RDF lists, but I think we can make something work. Certainly for simple cases.

I think it's just a matter of respecting the order when writing to the database at import? RDFLib luckily has this Collection class that allows handling lists as a Python iterable.

> The schema:position is interesting. I did think about creating an atramhasis:sortLabel or such, but I didn't really want to create yet another ontology. So, schema:position might be a good fit. The range looks to be schema:Integer or schema:Text. Simplest implementation would be exporting the sortLabel to schema:position as an rdf literal and reading that on import as well. Ignoring non rdf literal values.

That would work!

> We could decouple the position from the list of labels in the skosprovider interface or even make it language independent, but that is a bigger change and I'm not certain it's worth it for a fairly rare use case.

I agree

> How would you handle skos:OrderedCollection when dealing with sorting the narrower concepts of a concept? Adding an orderedcollection as an anonymous resource in between the broader concept and the narrower concepts just to create the order? Collections with a URI get imported as editable collections, collections without a URI just pass on the ordering to their narrower concepts.

We don't have this use case where collections are in between concepts, but I think you can use a skos:OrderedCollection wherever you use a skos:Collection. The only difference is that it adds a skos:memberList predicate that repeats the members in sequence. In the UI you could even solve it with a checkbox that indicates that the current (sort) order should be maintained.

Granted, an OrderedCollection doesn't make much sense if the order is a result of sorting; it's the ability to introduce and maintain a custom order that you want. The UI should support that, but it's a low-priority feature.

> So, we would have two ways of defining arbitrary orders on export/import: schema:position and skos:OrderedCollection. On import, schema:position would take precedence over skos:OrderedCollection.

I'd say these are two different things.

> Do you have any example files you could share? Would be useful to have as fixtures for unit tests.

This list of "Vakken": https://github.com/i-Learn-SKOS/common-conceptschemes/blob/main/common/schemes/vak-norelated-final.skos.ttl