savkov / bratutils

A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.
MIT License
29 stars, 12 forks

First step to handle discontinuous/split entities #22

Open jeanphilippegoldman opened 5 years ago

jeanphilippegoldman commented 5 years ago

I added self.frag to the Annotation constructor as a list of fragments, each of them a (start_idx, end_idx) pair, and kept self.start_idx (resp. self.end_idx) as the start (resp. end) of the first (resp. last) fragment. In most cases, there is only one fragment.
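To illustrate the idea, here is a minimal sketch, not the actual code of this PR. The Annotation class below is a simplified stand-in for the one in bratutils, and parse_text_bound is a hypothetical helper; only the frag, start_idx and end_idx attributes come from the description above. It relies on the brat standoff convention that the fragments of a discontinuous text-bound annotation are separated by semicolons.

```python
class Annotation:
    """Simplified stand-in for the bratutils Annotation class (assumption)."""

    def __init__(self, tag_name, frag, text=""):
        # frag is a list of (start_idx, end_idx) pairs, one per fragment.
        self.tag_name = tag_name
        self.frag = sorted(frag)
        # start_idx / end_idx keep their old meaning for contiguous entities:
        # start of the first fragment, end of the last fragment.
        self.start_idx = self.frag[0][0]
        self.end_idx = self.frag[-1][1]
        self.text = text


def parse_text_bound(line):
    """Parse one brat 'T' line (hypothetical helper, not part of bratutils)."""
    _ann_id, type_and_spans, text = line.rstrip("\n").split("\t")
    tag_name, spans = type_and_spans.split(" ", 1)
    # Discontinuous spans look like "0 5;16 23": one "start end" pair per fragment.
    frag = [tuple(int(i) for i in span.split()) for span in spans.split(";")]
    return Annotation(tag_name, frag, text)


ann = parse_text_bound("T1\tLocation 0 5;16 23\tNorth America")
print(ann.frag, ann.start_idx, ann.end_idx)
```

A contiguous entity such as "T2\tPerson 0 4\tJohn" simply ends up with a single-element frag list, so existing code that only reads start_idx and end_idx keeps working.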

Some comparisons in agreement.py should be handled a bit better (is_contained_by, contains_ann, is_partial_to, overlaps_with, has_partial_candidate, is_right_from, in_range), but at least discontinuous entities are handled.
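As a possible direction for those comparisons, here is a rough sketch of a fragment-aware overlap check. The function names are hypothetical and this is not how overlaps_with is currently implemented; it only shows why comparing the outer start_idx/end_idx can give a different answer than comparing fragments. Offsets are treated as half-open, following the brat convention that the end offset points just past the last character.

```python
def envelope_overlaps(a_frag, b_frag):
    """Overlap test on the outer bounds only (first start, last end)."""
    a_start, a_end = a_frag[0][0], a_frag[-1][1]
    b_start, b_end = b_frag[0][0], b_frag[-1][1]
    return a_start < b_end and b_start < a_end


def fragments_overlap(a_frag, b_frag):
    """True if any fragment of a shares at least one character with any fragment of b."""
    return any(
        a_start < b_end and b_start < a_end
        for a_start, a_end in a_frag
        for b_start, b_end in b_frag
    )


split_entity = [(0, 5), (16, 23)]  # discontinuous entity with a gap at 5..16
in_the_gap = [(6, 15)]             # entity lying entirely inside that gap

print(envelope_overlaps(split_entity, in_the_gap))   # True
print(fragments_overlap(split_entity, in_the_gap))   # False
```

Whether an entity sitting in the gap should count as overlapping is a design decision for the agreement metrics; the sketch only makes the two options explicit.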

Some additional testing would be welcome.

Also, I added an exception to skip lines of .ann files starting with an A (for attributes).
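Roughly, the skipping works like the sketch below. The function name is hypothetical and the real change lives in the existing parsing code; the sample lines follow the brat standoff format, where attribute lines have IDs starting with A.

```python
def iter_non_attribute_lines(lines):
    """Yield .ann lines, skipping attribute lines (IDs starting with 'A')."""
    for line in lines:
        if line.startswith("A"):
            # Attribute annotation, e.g. "A1\tNegation T1": ignored for now.
            continue
        yield line


sample = [
    "T1\tPerson 0 4\tJohn\n",
    "A1\tNegation T1\n",
    "T2\tLocation 8 13;20 27\tNorth America\n",
]
kept = list(iter_non_attribute_lines(sample))
print(len(kept))
```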