ArneBinder / pytorch-ie

PyTorch-IE: State-of-the-art Information Extraction in PyTorch
MIT License

implement `merge_annotations_from_documents()` #428

Closed ArneBinder closed 1 month ago

ArneBinder commented 1 month ago

This PR implements `utils.document.merge_annotations_from_documents()`. From the docstring:

"""Merge annotations from multiple documents into a single document. Optionally, store the source
names for all annotations / predictions in the metadata at key metadata_key_source_annotations
/ metadata_key_source_predictions, respectively.

Note that this will remove any annotation duplicates.

Args:
    documents: A dictionary mapping document source (e.g. dataset names) to documents.
    metadata_key_source_annotations: If not None, the key in the metadata where the source names
        for the (gold) annotations are stored.
    metadata_key_source_predictions: If not None, the key in the metadata where the source names
        for the predictions are stored.

Returns:
    The document with merged annotations.
"""
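
To illustrate the behavior the docstring describes, here is a minimal, self-contained sketch. It does not use the real pytorch-ie classes: `ToyDocument` and `merge_toy_documents` are hypothetical stand-ins, annotations are modeled as plain hashable tuples, and both the requirement that all documents share the same text and the metadata layout (one list of source names per merged annotation) are assumptions, not something the PR description states.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Optional


@dataclass
class ToyDocument:
    """Hypothetical stand-in for a document; NOT the pytorch-ie class."""

    text: str
    annotations: List[Hashable] = field(default_factory=list)
    predictions: List[Hashable] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def _collect(target: Dict[Hashable, List[str]], items, source: str) -> None:
    # Record each item once (dict keys deduplicate) and remember its sources.
    for item in items:
        sources = target.setdefault(item, [])
        if source not in sources:
            sources.append(source)


def merge_toy_documents(
    documents: Dict[str, ToyDocument],
    metadata_key_source_annotations: Optional[str] = None,
    metadata_key_source_predictions: Optional[str] = None,
) -> ToyDocument:
    if not documents:
        raise ValueError("no documents provided")
    first = next(iter(documents.values()))

    # Dicts preserve insertion order, so merged items keep first-seen order.
    annotation_sources: Dict[Hashable, List[str]] = {}
    prediction_sources: Dict[Hashable, List[str]] = {}
    for source, doc in documents.items():
        # Assumption: merging only makes sense over identical base text.
        if doc.text != first.text:
            raise ValueError(f"document from {source!r} has a different text")
        _collect(annotation_sources, doc.annotations, source)
        _collect(prediction_sources, doc.predictions, source)

    merged = ToyDocument(
        text=first.text,
        annotations=list(annotation_sources),
        predictions=list(prediction_sources),
    )
    # Assumed layout: one list of source names per merged annotation,
    # in the same order as the merged annotations.
    if metadata_key_source_annotations is not None:
        merged.metadata[metadata_key_source_annotations] = list(
            annotation_sources.values()
        )
    if metadata_key_source_predictions is not None:
        merged.metadata[metadata_key_source_predictions] = list(
            prediction_sources.values()
        )
    return merged


if __name__ == "__main__":
    docs = {
        "dataset_a": ToyDocument("Bob met Al", annotations=[(0, 3, "PER")]),
        "dataset_b": ToyDocument(
            "Bob met Al", annotations=[(0, 3, "PER"), (8, 10, "PER")]
        ),
    }
    merged = merge_toy_documents(docs, metadata_key_source_annotations="sources")
    print(merged.annotations)
    print(merged.metadata["sources"])
```

In this sketch the duplicate `(0, 3, "PER")` span appears only once in the merged result, while its source list names both datasets. How the real implementation compares annotations for equality and lays out the metadata may differ.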

Notes: