IIIF / api

Source for API and model specifications documents (api and model)
http://iiif.io/api

`@list` type containers are not suitable for some presentation API properties #994

Open christopher-johnson opened 7 years ago

christopher-johnson commented 7 years ago

This is a general problem that has been raised on the mailing list, specifically for sc:hasAnnotations, where the content search API expects an Annotation List but the response is typically an unordered set. Constraining a container property with @list means that a fromRDF serialization must be a Collection in order for it to be framed in an array. Similarly, for sc:metadataLabels, it is not optimal to model a set of key/value pairs as a list when there may be no value relationship between the pairs and the key is a string label. In contrast, sc:hasCanvases should remain an @list, as these objects need to be modeled as a collection to retain an explicit sequence in serialization. The distinction between @set and @list type containers in the context primarily affects framing, so I believe that implementations not using framing would not be affected if this were changed.

azaroth42 commented 7 years ago

I don't see how "the response is typically an unordered set" for a list of annotations. The response is an AnnotationList, and in the Web Annotation work, the exact equivalent AnnotationPage. And for metadata, the order is intentional -- the order in which the labels should be presented to the user.

azaroth42 commented 7 years ago

I propose close invalid wontfix. The situations where we have @list in the context are by design.

mikeapp commented 7 years ago

👍 to closing

jpstroop commented 7 years ago

👍

christopher-johnson commented 7 years ago

I am sorry, but your response to this issue is incorrect. If a search query response is in the form of RDF from a SPARQL interface, then the triples will not be in the form of an RDF collection which is equivalent to a JSON-LD @list.

azaroth42 commented 7 years ago

If the data is a SPARQL response, then it's not in the right form anyway. Just like if it was a SOLR response, an ElasticSearch response, a MongoDB response, an OpenSearch response, an SRU response, name your favorite back end system and chances are it won't be in exactly the right structure. Facebook's GraphQL might be the exception, as it specifies transformation and structure rules along with the query.

If you choose to use a particular technology, that's great. But the use cases for those fields have determined that order is important, and therefore @list is correct in the context document.

christopher-johnson commented 7 years ago

So, how then does a back end implement the IIIF "determined order" for a full text search? Most FT search engines use ranking algorithms to implement ordering, but there is no guarantee that the order persists over time.

azaroth42 commented 7 years ago

It doesn't matter how (or even if) the backend sorts a list. It matters that the representation that's transferred over the wire preserve that order for the client in that particular transaction. If the annotations were a set, not a list, and the client ran them through a JSON-LD to turtle processor, they would come out in a random order, and that is not an acceptable situation.

christopher-johnson commented 7 years ago

An @set is not random. It retains the order of the graph, but in framing, does not require the explicit ordering semantics of the RDF collection for serialization. For example, a graph with triples ordered like this: <> prop:text "text1" . <> prop:text "text2" . <> prop:text "text3" . will always be framed in a JSON-LD @set as: "prop:text": ["text1", "text2", "text3"]

Check this gist to see how @set works for deserialization. The order of the array is preserved in the graph and is not random.

azaroth42 commented 7 years ago

I'm sorry, but graph edges in RDF are unordered. If I serialize those triples repeatedly, the order will be whatever the specific implementation happens to produce while ensuring that the same triple does not occur multiple times.

From the JSON-LD spec:

@set Used to express an unordered set of data and to ensure that values are always represented as arrays.
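
To make the distinction under discussion concrete, here is a minimal sketch in plain Python (no RDF or JSON-LD library involved) of the two encodings. The helper names `encode_as_set`, `encode_as_list`, and `decode_list`, the blank node labels, and the `_:s` subject are all hypothetical and chosen only for illustration. It assumes triples are held in an unordered collection, and shows that repeated-property triples carry no positional information, while the rdf:first / rdf:rest chain of an rdf:List does.

```python
# Hypothetical illustration: these helpers are not part of any IIIF or JSON-LD library.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def encode_as_set(subject, predicate, values):
    """One triple per value; nothing in the triples records position."""
    return {(subject, predicate, value) for value in values}


def encode_as_list(subject, predicate, values):
    """rdf:List encoding: a chain of cells linked by rdf:first / rdf:rest."""
    if not values:
        return {(subject, predicate, RDF_NIL)}
    cells = [f"_:b{i}" for i in range(len(values))]  # made-up blank node labels
    triples = {(subject, predicate, cells[0])}
    for i, value in enumerate(values):
        triples.add((cells[i], RDF_FIRST, value))
        next_cell = cells[i + 1] if i + 1 < len(values) else RDF_NIL
        triples.add((cells[i], RDF_REST, next_cell))
    return triples


def decode_list(triples, subject, predicate):
    """Walk the rdf:first / rdf:rest chain to recover the original order."""
    index = {(s, p): o for s, p, o in triples}
    node = index[(subject, predicate)]
    values = []
    while node != RDF_NIL:
        values.append(index[(node, RDF_FIRST)])
        node = index[(node, RDF_REST)]
    return values


texts = ["text1", "text2", "text3"]
as_set = encode_as_set("_:s", "prop:text", texts)
as_list = encode_as_list("_:s", "prop:text", texts)

# The set encoding is identical for any permutation of the input ...
assert as_set == encode_as_set("_:s", "prop:text", list(reversed(texts)))
# ... while the list encoding lets a consumer rebuild the order.
assert decode_list(as_list, "_:s", "prop:text") == texts
```

The sketch says nothing about what a particular JSON-LD processor or triplestore does in practice; it only shows which of the two encodings records order in the data itself.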

christopher-johnson commented 7 years ago

I agree that there is no explicit order in RDF without RDF collection semantics. But, a JSON array does have an explicit order, and JSON-LD deserialization will always follow that order, so the argument that array values will be "randomized" on deserialization because of @set is invalid.

It is not clear to me how explicit ordering could work for an LD client in your data model outside of sequences. I know that none of the existing clients implement annotation list ordering in any way. How do you perceive ordered annotation lists and metadata being implemented by a JSON-LD aware client?

azaroth42 commented 7 years ago

Order for annotations: The reading order of the transcribed bits of text. Order for metadata: As above, the order in which the label/value pairs should be displayed.

But regardless, the context is correct as it stands because the range of the predicates that have @container of @list is rdf:List.

christopher-johnson commented 7 years ago

I make a final appeal with this explanation. All JSON-LD coexists as RDF in this data model. A basic characteristic of @list as an rdf:List is that it is a fixed and predefined structure with a finite number of elements. An RDF data provider, therefore, cannot (effectively) construct RDF lists "on demand", which is what happens when a keyword or type Search API query is executed. This should be clear and is the main reason why I have raised this issue.

In my understanding, the display of the annotations on the target should not depend on a predefined order of the data, but rather on canvas coordinates. The way that a client wants to display the metadata pairs should be a choice left to the client. I do not see why the server should care at all how the client displays the metadata. And this may end up not working if metadata sources for a client need to be extended beyond a single provider.

If it is not changed in your context, then an RDF data provider has to implement its own context for framing, which is not difficult, and what I have had to do to work around several context issues already. This is a definite issue that you should be aware of, even if it is not possible to fix under whatever premise you deem rational.

azaroth42 commented 7 years ago

Yes, the challenges of expressing order in triplestores/with SPARQL are well understood, but the use cases (as expressed) are valid and need to be supported. rdf:List is the best way to do that.

Closing, wontfix.

mielvds commented 3 months ago

@azaroth42 sorry to bring up this old issue, but we have an RDF stack and generating an rdf:List is an issue. What are the alternatives to sort newspaper pages, for instance?

azaroth42 commented 3 months ago

The items in a Range are still an rdf:List, so that doesn't really help.

If you can't manage rdf:Lists in your backend, then I would indeed use ListItem and then write some custom transformation code to turn the list of ListItems into a regular JSON array before sending it out.

There's also ore:Proxy (https://openarchives.org/ore/1.0/datamodel#Proxies ) but schema is likely better understood.
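
The ListItem transformation suggested above could look roughly like the following sketch. It assumes the backend query (for example a SPARQL SELECT) has already been turned into one dict per schema:ListItem with `item` and `position` keys; the function name `list_items_to_array`, the row shape, and the example.org URLs are hypothetical.

```python
def list_items_to_array(list_items):
    """Sort ListItem-like dicts by position and return their items in order."""
    ordered = sorted(list_items, key=lambda entry: entry["position"])
    return [entry["item"] for entry in ordered]


# Hypothetical rows, deliberately out of order, as an unordered query result might be.
rows = [
    {"item": "https://example.org/iiif/paper1/canvas/p3", "position": 3},
    {"item": "https://example.org/iiif/paper1/canvas/p1", "position": 1},
    {"item": "https://example.org/iiif/paper1/canvas/p2", "position": 2},
]

# The resulting list can be written out as a regular JSON array,
# so the order is carried by the array itself.
canvas_ids = list_items_to_array(rows)
```

With this approach the ordering lives in the backend as plain `position` values, and the array order is only produced at serialization time.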

mielvds commented 3 months ago

Well, that's disappointing :) So an RDF infrastructure with SPARQL CONSTRUCT is quite useless for building manifests on the fly.

The actual use of schema:ItemList or ore:Proxy is also useless because they are not part of the IIIF Presentation API and hence no viewer will interpret them (and people want to browse a newspaper in the right page order, it seems). Would a future version of IIIF be open to adding an alternative way of ordering items for all use cases where the array order is for some reason not the intended order? Rough yet-to-be-thought-through example:

{
  // Metadata about this canvas
  "id": "https://example.org/iiif/book1/canvas/p1",
  "type": "Canvas",
  "label": { "none": [ "p. 1" ] },
  "height": 1000,
  "width": 750,
  "items": [
    {
      "id": "https://example.org/iiif/book1/content/p1/1",
      "type": "AnnotationPage",
      "ordering": {
        "id": ...,
        "itemListElement": [    // @set
          { "id": ..., "item": "https://example.org/iiif/book1/content/p1/1", "position": 1 }
        ]
      },
      "items": [
        { "id": "https://example.org/iiif/book1/annotation/p0001-image", ... }
      ]
    }
  ]
}

azaroth42 commented 3 months ago

Reopening to allow further discussion

mielvds commented 1 month ago

@christopher-johnson how are you dealing with this issue as of now?

mielvds commented 1 week ago

@azaroth42 what's the right way to draft a proposal and have the discussion properly?