Proposal for adding 'timed-text' in the Text Granularity Extension

glenrobson commented 1 year ago

Links

Pull Request: https://github.com/IIIF/api/pull/2221
Preview: https://github.com/IIIF/api/pull/2221/files

Background and Summary

(Copied from pull request)

Captions and subtitles for video objects can be made available via the IIIF Presentation API by using annotations on the canvas that contains the media file. For reference, see the section "Captions and Subtitles" in the recipe "Transcripts, Captions, and Subtitles - General Considerations".

Just as the OCRed text of a newspaper can be provided via annotations with spatial coordinates on an image, captions and subtitles may be provided as annotations with temporal coordinates (a cookbook recipe for providing captions and subtitles as annotations is planned for the IIIF cookbook).

The Text Granularity Extension allows one to indicate the level of text granularity for an annotation (block, line, etc.), but it currently has no suitable granularity value for captions and subtitles, whose text granularity is neither paragraphs nor sentences. The text granularity of captions/subtitles follows standard subtitling guidelines in terms of reading speed, number of lines in each subtitle, line length (number of characters), minimum and maximum subtitle duration, and minimum interval between two consecutive subtitles.

Proposed Solution

In conclusion, the text granularity of captions and subtitles is specific to these resources, and this proposal consists of adding the text granularity level 'timed-text' to the levels defined by the extension.
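As a rough illustration of what the proposal would mean in practice, the sketch below builds a single caption annotation carrying the proposed value. It assumes the extension's existing `textGranularity` property and context URI, and a temporal media fragment (`#t=start,end`) on the canvas target. The helper name `caption_annotation`, the example.org identifiers, and the caption text are hypothetical, and 'timed-text' is only the value proposed here, not one defined by the published extension.

```python
import json


def caption_annotation(canvas_id, start, end, text, language="en"):
    """Build one caption annotation as a plain dict (illustrative only).

    The id pattern below is made up for this example; real manifests
    would use whatever identifiers the publisher assigns.
    """
    return {
        "@context": [
            "http://iiif.io/api/extension/text-granularity/context.json",
            "http://iiif.io/api/presentation/3/context.json",
        ],
        "id": f"{canvas_id}/annotation/caption-{start}-{end}",
        "type": "Annotation",
        "motivation": "supplementing",
        # Proposed value from this issue; not part of the published extension.
        "textGranularity": "timed-text",
        "body": {
            "type": "TextualBody",
            "value": text,
            "language": language,
            "format": "text/plain",
        },
        # Temporal coordinates: seconds from the start of the canvas.
        "target": f"{canvas_id}#t={start},{end}",
    }


if __name__ == "__main__":
    ann = caption_annotation(
        "https://example.org/iiif/video/canvas/1", 5, 8.5, "Welcome to the archive."
    )
    print(json.dumps(ann, indent=2))
```

With the existing levels, a client would have to label such an annotation as line or block, which says nothing about the subtitling constraints (duration, reading speed) that actually determine where the text is split.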

zimeon commented 1 year ago

I'm still struggling with whether this is a granularity like line, block etc. or really something orthogonal. I think I lean toward @nfreire's argument that this is comparable (one wouldn't also say it was line or block) but I think this needs some more discussion as a group to get agreement before moving forward.

Naming -- why not timed rather than timed-text? We don't say block-text.

If/when this moves forward we should update the date of addition (currently 2023-03-07) in the document history block before merge.

nfreire commented 1 year ago

Regarding the naming, "timed-text" is an established generic term in the audio-visual community to refer to captions, subtitles, etc.

triplingual commented 1 year ago

Where I resolve the orthogonality in my head is that the other granularities are dimensions of text, and time is a dimension of AV. Also that the "text" in AV, if spoken or signed, say, may not have lines or grafs but will often be captioned in multi-word units.

But AV text can be captioned in words even if the text is audio or gestural, so I do also think that some discussion needs to go into refining these text granularities to account for transcriptions of visual text in a video (e.g. Barbara Kruger's or Jenny Holzer's video work) and gestural language (with dimensions that are not necessarily graf/line/word/glyph).

glenrobson commented 1 year ago

Issue 117 (Proposal for adding 'timed-text' in the Text Granularity Extension)

+1: 1 [triplingual] 0: 4 [julsraemy kirschbombe regisrob zimeon] -1: 0 [] Not TRC: 0 [] Ineligible: 0 []

Result: 1 / 5 = 0.20

Issue is rejected
