FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
The observation element is a span annotation element that makes an observation pertaining to one or more word tokens. It is embedded in an observations layer. Observations offer an external qualification of part of a text. The qualification is expressed by the class, which is in turn defined by a set. The precise semantics of the observation depend on the user-defined set.
The element may for example act as a more generic replacement for the errordetection element, or to encapsulate observations from teachers/proofreaders on a text, in which case it is often used with the desc element. The following example shows observations from two fictitious sets:
<s>
  <w xml:id="w1"><t>The</t></w>
  <w xml:id="w2"><t>Dalai</t></w>
  <w xml:id="w3"><t>Lama</t></w>
  <w xml:id="w4"><t>greets</t></w>
  <w xml:id="w5"><t>himm</t></w>
  <w xml:id="w6"><t>.</t></w>
  <observations>
    <observation class="typo" set="http://somewhere/errordetection.set.xml">
      <wref id="w5"/>
    </observation>
  </observations>
  <observations>
    <observation class="encouragement" set="http://somewhere/teacherobservations.set.xml" annotator="teacher234" annotatortype="manual">
      <wref id="w1" />
      <wref id="w2" />
      <wref id="w3" />
      <wref id="w4" />
      <wref id="w5" />
      <wref id="w6" />
      <desc>Almost a good sentence, only one mistake. Keep up the good work!</desc>
    </observation>
  </observations>
</s>
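To illustrate how the wref references in the example above tie each observation to its word tokens, here is a minimal sketch that reads the sentence with Python's standard library only. It is not the FoLiA library's API. The function name `list_observations` and the constant names are hypothetical, and the sketch assumes the fragment carries no XML namespace, as in the example. A complete FoLiA document declares a namespace, so the tag names would then need to be qualified.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The example sentence, without a namespace declaration (assumption, see above).
SENTENCE = """
<s>
  <w xml:id="w1"><t>The</t></w>
  <w xml:id="w2"><t>Dalai</t></w>
  <w xml:id="w3"><t>Lama</t></w>
  <w xml:id="w4"><t>greets</t></w>
  <w xml:id="w5"><t>himm</t></w>
  <w xml:id="w6"><t>.</t></w>
  <observations>
    <observation class="typo" set="http://somewhere/errordetection.set.xml">
      <wref id="w5"/>
    </observation>
  </observations>
  <observations>
    <observation class="encouragement" set="http://somewhere/teacherobservations.set.xml" annotator="teacher234" annotatortype="manual">
      <wref id="w1" /><wref id="w2" /><wref id="w3" />
      <wref id="w4" /><wref id="w5" /><wref id="w6" />
      <desc>Almost a good sentence, only one mistake. Keep up the good work!</desc>
    </observation>
  </observations>
</s>
"""


def list_observations(sentence_xml):
    """Return one dict per observation, with its wref ids resolved to token text."""
    root = ET.fromstring(sentence_xml)
    # Map each word id to the text of its <t> child.
    tokens = {w.get(XML_ID): w.findtext("t") for w in root.iter("w")}
    results = []
    for obs in root.iter("observation"):
        words = [tokens[ref.get("id")] for ref in obs.findall("wref")]
        results.append({
            "class": obs.get("class"),
            "set": obs.get("set"),
            "words": words,
            "desc": obs.findtext("desc"),  # None when no <desc> is present
        })
    return results


for item in list_observations(SENTENCE):
    print(item["class"], item["words"], item["desc"])
```

Each observation is resolved independently, so the two layers, which use different sets, can cover overlapping tokens (here w5) without conflict.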
As always, further attributes can be associated with any observation using FoLiA's feature mechanism.
(proposal inspired by Revisely's solution)