annotation / stam

Stand-off Text Annotation Model (STAM) is a data model for stand-off-text annotation where any information on a text is represented as an annotation. This repository contains the model's full specification, extensions, schemas, examples and documentation.
https://annotation.github.io/stam/
Creative Commons Attribution Share Alike 4.0 International

How to save spelling variant annotations in STAM? #27

Open kaldan007 opened 4 months ago

kaldan007 commented 4 months ago

Annotated data:

Let's go to party(1)<{E1}part{E2}parties>. We will have lots of fun.

Here {E1} and {E2} refer to editions 1 and 2; "part" and "parties" are the spelling variants found in those editions, and "party" is the spelling in the latest edition. I have a parser that parses the annotation into a dictionary like this:

{
   'span':[11,15],
   'spelling_variant': {
              'E1':'part',
              'E2':'parties',
              'LE':'party'
         }
}

I am able to save the span in the target, but I am not able to save the spelling variants in the annotation data. Kindly help me.

proycon commented 4 months ago

STAM is very open to however you want to model your data, so there is no single correct answer to this.

I'm not sure I'm interpreting your use case correctly, but you could make an annotation data set named spelling_variants with the keys variant_text and edition (or variant_source?), and then create three annotations with a Text Selector on the same text span, each carrying annotation data such as variant_text = party, edition = LE, and so on. That way you can always add spelling variants from new editions independently as they become available, and it is fairly easy to query the variants for a specific edition.
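
To make the suggested layout concrete, here is a minimal, library-independent sketch in plain Python. It does not use the stam library API: the record shape (`target`, `data`, `set`, `key`, `value`) and the helper names `to_annotations` and `variants_for` are assumptions made for illustration only. It also assumes end-exclusive character offsets computed on the clean text without the inline markers, so the span is recomputed here rather than copied from the question.

```python
# Library-independent sketch: plain dicts only, NOT the stam library API.
# All record field names and helper names below are hypothetical.

text = "Let's go to party. We will have lots of fun."

# Stand-in for the parser output described in the question.
# Assumption: offsets are end-exclusive and refer to the clean text.
begin = text.index("party")
parsed = {
    "span": [begin, begin + len("party")],
    "spelling_variant": {"E1": "part", "E2": "parties", "LE": "party"},
}

DATASET = "spelling_variants"  # data set name suggested in the reply above


def to_annotations(parsed):
    """Return one annotation record per edition, all targeting the same span."""
    begin, end = parsed["span"]
    return [
        {
            "target": {"begin": begin, "end": end},
            "data": [
                {"set": DATASET, "key": "variant_text", "value": variant},
                {"set": DATASET, "key": "edition", "value": edition},
            ],
        }
        for edition, variant in parsed["spelling_variant"].items()
    ]


def variants_for(annotations, edition):
    """Collect the variant text(s) recorded for a single edition."""
    result = []
    for ann in annotations:
        values = {d["key"]: d["value"] for d in ann["data"]}
        if values["edition"] == edition:
            result.append(values["variant_text"])
    return result


annotations = to_annotations(parsed)
print(len(annotations))
print(variants_for(annotations, "E2"))
```

Each record would then correspond to one annotation in the store: the `target` becomes a text selector on the resource, and each entry in `data` becomes one piece of annotation data in the spelling_variants set. Adding a new edition later only means appending one more record for the same span.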

If you're unsure how to create the data using Python and the stam library, see the part "Creating an annotation dataset (vocabulary)" in the tutorial notebook: https://nbviewer.org/github/annotation/stam-python/blob/master/tutorial.ipynb