savkov / bratutils

A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.
MIT License

Discontinuous Annotations #5

Open aellenhicks opened 7 years ago

aellenhicks commented 7 years ago

I would like to use this in a corpus annotation project that uses discontinuous annotations, but I receive the following error.

Traceback (most recent call last):
  File "vso-inter-annotator.py", line 5, in <module>
    doc = a.DocumentCollection('data/BoireAnnotations/VSO_Hypertension1/')
  File "build/bdist.macosx-10.10-intel/egg/bratutils/agreement.py", line 834, in __init__
  File "build/bdist.macosx-10.10-intel/egg/bratutils/agreement.py", line 654, in __init__
  File "build/bdist.macosx-10.10-intel/egg/bratutils/agreement.py", line 292, in __init__
  File "build/bdist.macosx-10.10-intel/egg/bratutils/agreement.py", line 301, in _parse_annotation
ValueError: invalid literal for int() with base 10: '6419;6435'

vso-inter-annotator.py contains the following:

# VSO inter-rater agreement using BRAT utils

from bratutils import agreement as a

doc = a.DocumentCollection('data/BoireAnnotations/VSO_Hypertension1/')
doc2 = a.DocumentCollection('data/HerringAnnotations/VSO_Hypertension1/')

doc.make_gold()
statistics = doc2.compare_to_gold(doc)

print statistics

Here is the annotation file that is causing the error.

T1 VSO_0000005 3395 3407 182/107 mmHg
T2 VSO_0000005 4300 4312 200/100 mmHg
T3 VSO_0000008 4254 4260 36.8°C
T4 VSO_0000005 6518 6529 160/80 mmHg
T5 VSO_0000005 15833 15844 170/80 mmHg
T6 VSO_0000038 16385 16408 Systolic blood pressure
T7 VSO_0000005 16438 16446 200 mmHg
T8 VSO_0000005 16867 16878 135/95 mmHg
T9 VSO_0000005 16959 16971 160/100 mmHg
T10 VSO_0000005 17659 17671 220/120 mmHg
T11 VSO_0000005 18143 18154 135/95 mmHg
T12 VSO_0000004 3370 3384 blood pressure
T13 VSO_0000007 4239 4250 temperature
T14 VSO_0000004 4282 4296 blood pressure
T15 VSO_0000004 6486 6500 Blood pressure
T16 VSO_0000004 15802 15816 Blood pressure
T17 VSO_0000004 16826 16840 Blood pressure
T18 VSO_0000004 16941 16955 blood pressure
T19 VSO_0000004 17624 17638 Blood pressure
T20 GO_0008217 17713 17738 Blood pressure normalized
T21 VSO_0000004 18125 18139 blood pressure
T23 VSO_0000030 4341 4360 63 beats per minute
T24 GO_0008217 6405 6419;6435 6442 blood pressure control
T31 GO_0008217 16046 16060;16072 16079 blood pressure control
T33 VSO_0000006 16826 16844;16855 16863 Blood pressure was measured
T34 GO_0008217 17015 17029;17041 17048 blood pressure control
T38 VSO_0000029 4314 4324 Heart rate
T39 VSO_0000004 6147 6161 blood pressure
T41 GO_0008217 6486 6514 Blood pressure was decreased
T43 VSO_0000006 15802 15829 Blood pressure was measured
T22 VSO_0000004 6405 6419 blood pressure
T25 VSO_0000004 16046 16060 blood pressure
T26 VSO_0000004 17015 17029 blood pressure
T27 VSO_0000004 17713 17727 Blood pressure
T28 VSO_0000004 18517 18531 blood pressure

Thank you

savkov commented 7 years ago

Hi,

thanks for your feedback! Discontinuous annotations as well as relations are expressed in a particular way that is currently not supported. However, as this is now closer to my work I intend to develop those in the near future. Stay tuned :)

Sasho
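The failure can be reproduced outside bratutils. The following is a minimal sketch, assuming the parser splits the "type and offsets" field on whitespace and converts the second and third tokens with int(); the actual code at agreement.py line 301 is not shown in this thread, and `naive_offsets` is a hypothetical helper, not part of the library.

```python
def naive_offsets(span_field):
    """Parse 'TYPE START END', assuming a single contiguous span.

    Hypothetical helper for illustration only; not bratutils code.
    """
    tokens = span_field.split()
    return tokens[0], int(tokens[1]), int(tokens[2])


# A contiguous annotation (T22 in the reported file) parses fine.
contiguous = naive_offsets("VSO_0000004 6405 6419")

# A discontinuous annotation (T24) puts '6419;6435' in the third token,
# which int() cannot convert.
try:
    naive_offsets("GO_0008217 6405 6419;6435 6442")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(contiguous)
print(error_message)
```

Under that assumption, any annotation whose offsets contain a semicolon would trigger the same ValueError, which matches T24, T31, T33 and T34 in the reported file.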

aellenhicks commented 7 years ago

Hi,

Thanks for letting me know.

Amanda


In reply to Sasho Savkov's message of Tuesday, March 14, 2017, 6:37 PM.


tobiasoleary commented 4 years ago

jeanphilippgoldman's fork seems to fix this issue by adding support for fragments. One option would be a pull request from his master that just leaves out the commit where he turns off the DEBUG logger, so that it matches your default behavior.
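For readers unfamiliar with fragments: in brat standoff files, a discontinuous text-bound annotation lists several start/end pairs separated by semicolons. The sketch below shows one way such a field could be parsed. It is not the code from the fork or from bratutils, and `parse_fragments` is a hypothetical name chosen for illustration.

```python
def parse_fragments(span_field):
    """Return (type, [(start, end), ...]) for a 'type and offsets' field.

    Hypothetical helper for illustration only. Handles both contiguous
    fields ('VSO_0000004 6405 6419') and discontinuous ones
    ('GO_0008217 6405 6419;6435 6442').
    """
    tag, offsets = span_field.split(" ", 1)
    fragments = []
    # Each semicolon-separated piece holds one 'start end' pair.
    for piece in offsets.split(";"):
        start, end = piece.split()
        fragments.append((int(start), int(end)))
    return tag, fragments


tag, fragments = parse_fragments("GO_0008217 6405 6419;6435 6442")
print(tag, fragments)

# A contiguous annotation simply yields a single fragment.
single_tag, single_fragments = parse_fragments("VSO_0000004 6405 6419")
print(single_tag, single_fragments)
```

Representing every annotation as a list of fragments keeps contiguous and discontinuous cases on one code path; how agreement should be scored across fragments is a separate design decision that this sketch does not address.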