Open savkov opened 5 years ago
Hi Savkov, thanks for addressing this issue. I'm not an expert coder at all, so personally I can't help very much. However, a colleague of mine also works with Brat, so I'll discuss the issue with him; he may have some ideas. I'll let you know ASAP.
Hi Sasho, Thanks for sharing your thoughts. I agree that thinking about agreement evaluation can quickly become sprawling. I am not sure I can be helpful in developing the evaluation of relations, but let me add my input.
First, you say "evaluating the agreement is really quite easy -- F1-score where each triple is treated as a unique annotation". Let's go with this for a start. But soon we may want to distinguish exactly matching relations from relations pointing to the same chunks but with a different type, and then relations with one common argument instead of two.
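To make the three cases above concrete, here is a minimal sketch of how a pair of relations could be sorted into them. It assumes a relation is a plain `(rel_type, arg1, arg2)` tuple whose argument identifiers are shared between annotators, and that argument order matters; the function name `match_kind` and the category labels are made up for illustration and are not part of bratutils.

```python
from typing import Tuple

# Hypothetical representation: (relation type, first argument id, second argument id).
Relation = Tuple[str, str, str]


def match_kind(a: Relation, b: Relation) -> str:
    """Classify how two relations overlap (assumes directed relations)."""
    type_a, a1, a2 = a
    type_b, b1, b2 = b
    if (a1, a2) == (b1, b2):
        # Both arguments agree; only the type can still differ.
        return "exact" if type_a == type_b else "same_args_different_type"
    if a1 == b1 or a2 == b2:
        # Exactly one argument agrees in the same position.
        return "one_shared_argument"
    return "no_match"


print(match_kind(("Cause", "T1", "T2"), ("Treats", "T1", "T2")))
```

A partial-credit score could then weight these categories differently, but which weights make sense is an open question.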
Another point: you say "in many cases the arguments are not necessarily predetermined", defining arguments as "chunks or some other pre-annotated spans". How can a relation point to something else? Please clarify.
Sorry for my naive remarks and questions. Thanks for your work. Jean-Philippe Goldman
j:-P
On Thu, Jun 27, 2019 at 9:07 PM Sasho Savkov notifications@github.com wrote:
Support for relations has long been asked for, but I've been reluctant to implement it because the code is not my best and I'd rather not go back into the heavy logic. However, I just worked on getting the parsing function to handle all types gracefully, and it looks like relations can be implemented in a way that is self-contained and probably quite straightforward. So I'll lay out what I want to do here and ask for feedback.
Relations are effectively triples of two arguments and a relation type. Assuming that the possible arguments are predetermined, e.g. arguments can only be tokens, chunks, or some other pre-annotated spans, evaluating the agreement is really quite easy -- F1-score where each triple is treated as a unique annotation. I can probably copy lots of the code straight from bioeval https://github.com/savkov/bioeval/.
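As a rough illustration of the triple-based F1 idea, the sketch below treats each annotator's relations as a set of `(rel_type, arg1, arg2)` tuples and counts exact matches. It is not taken from bioeval or bratutils: the name `triple_f1`, the tuple layout, and the assumption that both annotators use the same argument identifiers are all illustrative.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (relation type, first argument id, second argument id).
Relation = Tuple[str, str, str]


def triple_f1(gold: Iterable[Relation], pred: Iterable[Relation]) -> float:
    """F1 where a relation counts as correct only if the whole triple matches."""
    gold_set, pred_set = set(gold), set(pred)
    if not gold_set and not pred_set:
        return 1.0  # nothing to annotate and nothing annotated: full agreement
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


annotator_a = [("Cause", "T1", "T2"), ("Cause", "T3", "T4"), ("Treats", "T5", "T6")]
annotator_b = [("Cause", "T1", "T2"), ("Treats", "T3", "T4"),
               ("Treats", "T5", "T6"), ("Cause", "T7", "T8")]
print(triple_f1(annotator_a, annotator_b))  # 2 shared triples out of 3 and 4
```

Because F1 is symmetric in precision and recall, it does not matter which annotator is treated as the reference.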
I haven't thought about this for too long, but using F1-score seems to be a bit of a cop-out here. The probability of a random assignment of a relation being correct is not negligible, so maybe kappa can be implemented here instead.
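One way the kappa idea could look, again only as a sketch under the predetermined-arguments assumption: fix a list of candidate argument pairs, let each annotator label every pair with a relation type or a placeholder for "no relation", and compute Cohen's kappa over those labels. The names `cohen_kappa` and `relation_kappa`, the `"NONE"` label, and the choice of candidate pairs are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple

Pair = Tuple[str, str]  # hypothetical (arg1 id, arg2 id)


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label everywhere
    return (observed - expected) / (1 - expected)


def relation_kappa(candidate_pairs: Sequence[Pair],
                   rels_a: Dict[Pair, str],
                   rels_b: Dict[Pair, str],
                   none_label: str = "NONE") -> float:
    """Kappa over candidate pairs; unannotated pairs get `none_label`."""
    labels_a = [rels_a.get(pair, none_label) for pair in candidate_pairs]
    labels_b = [rels_b.get(pair, none_label) for pair in candidate_pairs]
    return cohen_kappa(labels_a, labels_b)


pairs = [("T1", "T2"), ("T3", "T4"), ("T1", "T3"), ("T2", "T4")]
rels_a = {("T1", "T2"): "Cause", ("T3", "T4"): "Cause"}
rels_b = {("T1", "T2"): "Cause", ("T3", "T4"): "Treats"}
print(relation_kappa(pairs, rels_a, rels_b))
```

The result depends heavily on how many candidate pairs are included, since every extra pair that both annotators leave unlabelled adds "NONE" agreement; that choice would need to be settled before kappa values are comparable.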
Additionally, in many cases the arguments are not necessarily predetermined, so that would be quite hard to evaluate at the same time and honestly I have no idea how to do it ATM.
So I'm looking for some input here. Would be nice to hear what you think.
cc @jeanphilippegoldman https://github.com/jeanphilippegoldman @soluna1 https://github.com/soluna1