Indeed: annotationBlock must be the elementary unit of representation in standOff annotations.
F2F (Victoria, 2017) Council agrees that @peterstadler and @laurentromary should go ahead with working on this.
F2F (Victoria 2017): we need to move forward with implementing LinkDataBlock.
@laurentromary and I just had a conf call discussing this issue and the further roadmap. He made a strong point about not merging the standoff proposal with a 'linkDataBlock' proposal, because the first is about annotating some text (thus pointing into the text) whereas the second is about adding editorial content to some text (which is pointed to from the text). There has been some confusion about these distinctions (on my part too), so I hope Laurent will elaborate on this!
Just stumbled across https://github.com/one-step-beyond/tei-standoff (but didn't take a closer look). Anyone familiar with this?
Apparently a Turska-Spadini production. Don't you TEI councillors talk to each other any more?
Maybe we should start discussing the creation of elements. We have 2 to create:
I am also in Tokyo from Tuesday to Friday of the TEI conference. If a group from the council is ready to make a move towards implementation, we could simply have an operational session there.
@peterstadler an attempt I was considering with Elena Spadini back in the day, virtually dead now, though some of that thinking was incorporated in an approach we used later for earlyPrint (namely using existing TEI elements as the body of the annotation)
@laurentromary would be great to have a session in Tokyo! So far we have agreed to get our act together in a small group to get back to Council
A wrapper <standOff> element has been created in 565fc72caec15d3569e496b5e71fb05ca772b158 and will be available in the upcoming release 3.7.0.
There is probably more to be fleshed out concerning the content of <standOff>, but this warrants dedicated tickets, so closing this one here.
Annotating documents with standoff annotations is a very useful and flexible methodology. Nevertheless, TEI does not have any specific elements for encoding this information. In most cases, the standoff annotations are stored as external TEI files linked to the text being annotated. This way of storing them is very rigid and causes numerous problems, for example when indexing or searching a corpus of documents using the information in the annotations. In these cases, it would be very useful to have the standoff annotations INSIDE the TEI documents being annotated (!!!).
Therefore, it is suggested to define a new set of TEI elements specifically dedicated to encoding standoff annotations.
The idea would be to store the standoff annotations between the <teiHeader> and the <text>, following the same philosophy as used for the <facsimile> and for <sourceDoc> (in some way these two elements could also be considered as a "type" of annotation).
For the standoff annotation, the structure could be:
<TEI>
  <teiHeader> ... </teiHeader>
  <standoff> [information of the annotations] </standoff>
  <text> ... </text>
</TEI>
This structure would have the extra advantage of allowing information to be annotated at different TEI levels in a natural manner. So for more complicated TEI documents with several hierarchical levels, the standoff annotations could be encoded as follows:
<teiCorpus>
  <teiHeader> ... </teiHeader>
  <TEI>
    <teiHeader> ... </teiHeader>
    <standoff> ... </standoff>
    <text> ... </text>
  </TEI>
  <TEI>
    <teiHeader> ... </teiHeader>
    <standoff> ... </standoff>
    <text> ... </text>
  </TEI>
</teiCorpus>
This structure would also make it possible to annotate not only the text of the document, but also the metadata at the different hierarchical levels of the TEI document.
The specific encoding of the annotations inside <standoff> could be as follows:
<standoff>
  <annotation type="..." subtype="...">
    <author>...</author>
    <date>...</date>
    <ptr>...</ptr>
    [other data needed]
  </annotation>
</standoff>
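To make the skeleton above more concrete, a filled-in instance might look like the sketch below. It only illustrates the markup proposed in this ticket, not an adopted TEI schema. The type and subtype values, the xml:id values (title1, p1), the author name, and the date are all invented for illustration. The use of <ptr target="#..."> to identify the annotated element and of <note> to carry the annotation body are assumptions, since the proposal leaves "[other data needed]" open.

```xml
<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- hypothetical xml:id so that the header can be annotated too -->
        <title xml:id="title1">Sample document</title>
      </titleStmt>
      <!-- rest of the header omitted -->
    </fileDesc>
  </teiHeader>
  <standoff>
    <!-- annotation pointing into the text -->
    <annotation type="commentary" subtype="linguistic">
      <author>Jane Doe</author>
      <date>2015-01-01</date>
      <ptr target="#p1"/>
      <note>Unusual word order in this paragraph.</note>
    </annotation>
    <!-- annotation pointing into the metadata of the same document -->
    <annotation type="commentary" subtype="bibliographic">
      <author>Jane Doe</author>
      <date>2015-01-02</date>
      <ptr target="#title1"/>
      <note>Title supplied by the encoder.</note>
    </annotation>
  </standoff>
  <text>
    <body>
      <p xml:id="p1">The text being annotated.</p>
    </body>
  </text>
</TEI>
```

Because the annotations and their targets live in the same file, an indexer could resolve each target with a plain xml:id lookup, without fetching an external document.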
As a last remark, it is also suggested to allow the TEI element <figure> inside <annotation>, in order to facilitate annotation not only with textual information, but also with images and formulas.
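As a rough sketch of what that could look like, the fragment below places a <figure> inside the proposed <annotation>, once with an image and once with a formula. The image path, the xml:id values pointed to, the name, and the dates are hypothetical. <graphic>, <figDesc>, and <formula> are existing TEI elements, but whether they would be permitted in this position depends on a content model that this ticket does not define.

```xml
<standoff>
  <!-- annotation whose body is an image -->
  <annotation type="illustration" subtype="image">
    <author>Jane Doe</author>
    <date>2015-01-03</date>
    <ptr target="#p1"/>
    <figure>
      <!-- hypothetical image location -->
      <graphic url="images/detail.png"/>
      <figDesc>Detail of the passage in the original manuscript.</figDesc>
    </figure>
  </annotation>
  <!-- annotation whose body is a formula -->
  <annotation type="illustration" subtype="formula">
    <author>Jane Doe</author>
    <date>2015-01-04</date>
    <ptr target="#p2"/>
    <figure>
      <formula notation="TeX">E = mc^2</formula>
    </figure>
  </annotation>
</standoff>
```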
Conclusion: the proposed structure for the encoding of standoff annotations in TEI provides the following advantages:
- allows standoff annotations to be encoded in TEI in a natural manner, which is not the case at the moment
searching said documents
Remark: this idea has already been suggested by Piotr Bański in his article "Why TEI stand-off annotation doesn't quite work and why you might want to use it nevertheless", in http://www.balisage.net/Proceedings/vol5/html/Banski01/BalisageVol5-Banski01.html
Original comment by: sf_user_posejavier