digirati-co-uk / dlcs-search-service

Search service for IIIF Content Search and annotation indexing.
MIT License

IIIF Content Search: Query parameters #6

Open mattmcgrattan opened 6 years ago

mattmcgrattan commented 6 years ago

The service should support the standard IIIF Content Search Parameters.

https://iiif.io/api/search/1.0/#query-parameters

tomcrane commented 6 years ago

For our purposes I would treat xywh and t as IIIF standard query params. The only reason they are not in the spec is that we wanted to wait to see how Presentation 3 treated time. Earlier Search API drafts have a fifth region param which was to have taken x,y,w,h values.

For MVP then:

[ ] xywh: spatial media fragment (as per the Media Fragments spec). The server only returns annotations that intersect this region (rather than only those wholly contained by it). An extension flag could possibly toggle intersection/containment. The presence of this parameter implicitly limits the annotations that can be returned to those that target space (but see note below).
[ ] t: temporal fragment (as per the Media Fragments spec). The server only returns annotations that intersect this time period.
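To make the intersection/containment distinction concrete, here is a minimal sketch. It is not the service's implementation: the names `Region`, `parse_xywh`, `parse_t`, `region_matches` and `spans_overlap` are hypothetical, pixel units are assumed (no `percent:` prefix handling), and intervals are treated as half-open, so ranges that merely touch do not count as intersecting.

```python
import math
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    x: float
    y: float
    w: float
    h: float


def parse_xywh(value: str) -> Region:
    """Parse an 'x,y,w,h' value (pixel units assumed)."""
    x, y, w, h = (float(part) for part in value.split(","))
    return Region(x, y, w, h)


def parse_t(value: str) -> Tuple[float, float]:
    """Parse 'start,end', 'start' or ',end' into a (start, end) pair in seconds."""
    start, _, end = value.partition(",")
    return (float(start) if start else 0.0, float(end) if end else math.inf)


def intersects(a: Region, b: Region) -> bool:
    """True when the two rectangles share some area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def contains(outer: Region, inner: Region) -> bool:
    """True when 'inner' lies wholly inside 'outer'."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


def region_matches(query: Region, target: Region, containment: bool = False) -> bool:
    """Default is intersection; the hypothetical extension flag switches to containment."""
    return contains(query, target) if containment else intersects(query, target)


def spans_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two time ranges share some duration."""
    return a[0] < b[1] and b[0] < a[1]
```

With `xywh=100,100,200,200`, a target at `250,250,100,100` matches under intersection but not under containment, while a target at `120,120,50,50` matches under both.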

TBC (and here we are essentially drafting Search API 2)

Combinations of behaviours. Say you have annos on video that only have t targets (like those in https://openhypervideo.github.io/iiif-interactive-transcript/):

Example - https://tomcrane.github.io/bbctextav/iiif/ID194804400-transcript.json

...but the canvases are spatial (as they are here; it's a video). If I fire off xywh=100,100,200,200&t=10,20 then I must get all the annos that overlap t=10,20, because they have an implicit spatial target of the whole canvas.

But if it was a radio broadcast, where the transcription anno targets look the same but the target has no spatial dimensions, what happens? The search server that indexed the annos has no idea whether the anno targets have a spatial extent (we can't require dereferencing, and it may not even be possible). In this particular case the targets of the indexed annos are implicitly t=t1,t2&xywh=0,0,1920,1080, but even that info alone doesn't tell you everything. You'd still need to know at query time that the target canvas itself isn't a 4K video: we don't know that 1920,1080 is the full extent of the target, and is therefore implicit in the query.

(need to write this up as a IIIF issue for Auth 2)

We need a param that identifies whether the spatial or temporal constraint should cause annos that lack targets with those dimensions to be excluded or included in the response.
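A rough sketch of how such a switch could behave on the server side. The `include_missing` flag is hypothetical (no such parameter exists in the Search API), and the function names are made up for illustration. A target is reduced here to an optional box and an optional time span, with `None` meaning the indexed annotation carried no selector for that dimension.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, w, h
Span = Tuple[float, float]               # start, end in seconds


def boxes_overlap(a: Box, b: Box) -> bool:
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def spans_overlap(a: Span, b: Span) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def target_matches(target_box: Optional[Box],
                   target_span: Optional[Span],
                   query_box: Optional[Box],
                   query_span: Optional[Span],
                   include_missing: bool) -> bool:
    """Apply each query constraint; 'include_missing' decides what happens
    when the target has no selector for a constrained dimension."""
    if query_box is not None:
        if target_box is None:
            if not include_missing:
                return False
        elif not boxes_overlap(target_box, query_box):
            return False
    if query_span is not None:
        if target_span is None:
            if not include_missing:
                return False
        elif not spans_overlap(target_span, query_span):
            return False
    return True
```

For a transcript anno with only a `t` target of 12 to 15 seconds, a query of `xywh=100,100,200,200` plus `t=10,20` returns it when `include_missing` is true (the TV case, where the whole canvas is the implicit spatial target) and drops it when false. The flag does not tell the server which case it is in; the client has to know.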

Is that enough though to meet use cases?

All the search server indexes is the annos. These would look identical for a radio broadcast and a TV broadcast. https://tomcrane.github.io/bbctextav/iiif/ID194804400-transcript.json

Another issue, less difficult I hope:

Point times (where intersect means overlap) - we run into the same problem that caused the Presentation API to introduce a point selector (there's no notion of points in the W3C Media Fragments spec; a point is not a fragment).
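The edge case can be shown in a few lines. This sketch assumes one possible rule, not an agreed one: a target whose start equals its end is treated as a point, and it matches when it falls within the query range, inclusive at both ends. The function names are hypothetical.

```python
from typing import Tuple

Span = Tuple[float, float]  # start, end in seconds


def span_overlaps(a: Span, b: Span) -> bool:
    """Strict overlap: the two ranges must share some duration."""
    return a[0] < b[1] and b[0] < a[1]


def point_in_span(point: float, span: Span) -> bool:
    """Assumed rule: a point matches if it lies within the range, ends included."""
    return span[0] <= point <= span[1]


def temporal_match(target: Span, query: Span) -> bool:
    """Treat a zero-length target as a point; otherwise require overlap."""
    start, end = target
    if start == end:
        return point_in_span(start, query)
    return span_overlaps(target, query)
```

Under strict overlap, a point at 10 seconds does not match `t=10,20`, because it shares no duration with the range. The point-aware check above does match it, which is the kind of behaviour a point selector would need to pin down.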