This PR replaces a lot of functionality with methods from pytorch_ie.DatasetDict and pie_utils.document.processors because these are better integrated and well tested.

This also:
- removes the following functionality because it was quite specialized:
  - remove_overlapping_entities()
  - trim_spans()
- moves the following to src.pipeline.ner_re_pipeline because it is only used there:
  - clear_annotation_layers()
  - move_annotations_from_predictions()
  - move_annotations_to_predictions()
  - add_annotations_from_other_documents()
- moves src.utils.dataset.process_dataset() to src.utils.config.execute_pipeline() (because it can be used for any pipelined processing, not just dataset processing)
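
To illustrate why the last move makes sense, here is a minimal sketch of a generic pipeline executor. It is not the actual implementation or signature of src.utils.config.execute_pipeline(); the name execute_pipeline_sketch and its parameters are hypothetical and only show that nothing in such a function needs to be specific to datasets.

```python
from typing import Any, Callable, Iterable


# Hypothetical illustration only: this is NOT the real signature of
# src.utils.config.execute_pipeline().
def execute_pipeline_sketch(data: Any, steps: Iterable[Callable[[Any], Any]]) -> Any:
    """Apply each step to the running result, in the given order."""
    result = data
    for step in steps:
        result = step(result)
    return result


# Works for a dataset-like mapping of splits to documents ...
lowercased = execute_pipeline_sketch(
    {"train": ["A", "B"]},
    [lambda d: {split: [doc.lower() for doc in docs] for split, docs in d.items()}],
)

# ... and equally for any other kind of input.
total = execute_pipeline_sketch(3, [lambda x: x + 1, lambda x: x * 2])
```

Because the executor only assumes that each step is a callable taking the previous result, it fits better in a general config/utility module than in a dataset-specific one.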

Requires pytorch-ie 0.17.1 and pie-utils v0.5.0.

Also see https://github.com/ArneBinder/pie-document-level/pull/88 for the respective PR in that repo.