This adds a BratSerializer, which writes model predictions to annotation files (.ann) in Brat format. It requires a layers parameter that names the annotation layers to serialize. For now, it supports layers containing LabeledSpan, LabeledMultiSpan, and BinaryRelation annotations. If a gold_label_prefix is provided, the gold annotations are serialized as well, marked with the given prefix. Otherwise, only the predicted annotations are serialized. Optionally, a document_processor can be provided to process documents before serialization.
Usage
from src.serializer import BratSerializer
from pie_modules.annotations import BinaryRelation, LabeledSpan
from pie_modules.documents import TextDocumentWithLabeledSpansAndBinaryRelations
# create an example document
document = TextDocumentWithLabeledSpansAndBinaryRelations(
text="Harry lives in Berlin. He works at DFKI.", id="tmp_1"
)
# create annotations
harry = LabeledSpan(start=0, end=5, label="PERSON") # Harry
berlin = LabeledSpan(start=15, end=21, label="LOCATION") # Berlin
# add annotations to the document
document.labeled_spans.predictions.extend([harry, berlin])
document.binary_relations.predictions.append(
    BinaryRelation(head=harry, tail=berlin, label="lives_in")
)
# serialize the predictions of both layers to .ann files under /tmp
serializer = BratSerializer(path="/tmp", layers=["labeled_spans", "binary_relations"])
metadata = serializer(documents=[document])
"""
Saved at os.path.join(metadata['path'], f"{document.id}.ann") with the following content:
T0 LOCATION 15 21 Berlin
T1 PERSON 0 5 Harry
R0 lives_in Arg1:T1 Arg2:T0
"""
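For readers unfamiliar with the file layout, the following is a minimal, library-independent sketch of how lines like the ones above are formed in the Brat standoff format. It is not the BratSerializer implementation: the helpers `span_line` and `relation_line` are hypothetical names used only for illustration, and the tab separators follow the general Brat standoff convention rather than anything confirmed about this serializer's exact output. The list of slices shows how a multi-fragment span (as in LabeledMultiSpan) could be written with `;`-separated offsets.

```python
from typing import List, Tuple


def span_line(idx: int, label: str, slices: List[Tuple[int, int]], text: str) -> str:
    """Format one text-bound annotation ("T" line).

    Hypothetical helper: a single (start, end) pair models a contiguous span,
    several pairs model a discontinuous span.
    """
    # Fragments of a discontinuous span are separated by ";" in the offsets
    offsets = ";".join(f"{start} {end}" for start, end in slices)
    # The covered text of the fragments is joined with single spaces
    covered = " ".join(text[start:end] for start, end in slices)
    return f"T{idx}\t{label} {offsets}\t{covered}"


def relation_line(idx: int, label: str, head_id: str, tail_id: str) -> str:
    """Format one binary relation ("R" line) that refers to two "T" ids."""
    return f"R{idx}\t{label} Arg1:{head_id} Arg2:{tail_id}"


text = "Harry lives in Berlin. He works at DFKI."
lines = [
    span_line(0, "LOCATION", [(15, 21)], text),
    span_line(1, "PERSON", [(0, 5)], text),
    relation_line(0, "lives_in", "T1", "T0"),
]
ann_content = "\n".join(lines) + "\n"
print(ann_content)
```

The relation line only references span ids, so every span has to be assigned an id before any relation that uses it can be written.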
Note: This PR updates pie-modules to v0.10.6 with fixed LabeledMultiSpan (see here)