HugoSousa closed this issue 8 years ago
Hi,
Thanks for raising this issue. This is just to acknowledge that I'm on it. This is part of my PhD code, which really needs attention. I will get back to you with some examples later on. I suspect part of the code is missing here.
Sasho
Hi,
your example should be working now. I had forgotten to rename the attribute postag_list to tags in some cases. PyCharm's refactoring is sometimes very sneaky. I've also added some examples on how to use the agreement module. Hope that helps. Let me know if there is anything else.
Sasho
Hey,
I appreciate the fast reply, and thanks for fixing and improving the project. As a suggestion, it would be nice to have this in the README, to make the statistics metrics easier to understand.
However, I'm now questioning the logic of the program. I'm running the sample with doc and doc2 pointing to the same file, so precision and recall should both be 1, right?
It's not the case, though. The following code:
from bratutils import agreement as a
doc = a.Document('res/samples/A/data-sample-1.ann')
doc2 = a.Document('res/samples/A/data-sample-1.ann')
doc.make_gold()                         # mark doc as the gold standard
statistics = doc2.compare_to_gold(doc)  # score doc2 against the gold
print statistics
Results in the following statistics:
-------------------MUC-Table--------------------
------------------------------------------------
pos:158
act:158
cor:130
par:0
inc:28
mis:0
spu:0
------------------------------------------------
pre:0.822784810127
rec:0.822784810127
fsc:0.822784810127
------------------------------------------------
und:0.0
ovg:0.0
sub:0.177215189873
------------------------------------------------
bor:158
ibo:0
------------------------------------------------
------------------------------------------------
I guess the incorrect counter (inc: 28) is not being calculated correctly.
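If I read the MUC counters the usual way (pre = cor/act, rec = cor/pos, sub = inc/act), the derived scores are at least consistent with the raw counts, so the problem seems to be in how inc is computed rather than in the score arithmetic. A quick sanity check in Python (these formulas are my assumption about what the library does, not taken from its source):

# Reproduce the reported scores from the raw MUC counters above,
# assuming pre = cor/act, rec = cor/pos and sub = inc/act.
pos, act = 158, 158
cor, inc = 130, 28
pre = float(cor) / act             # 0.8227848101...
rec = float(cor) / pos             # 0.8227848101...
fsc = 2 * pre * rec / (pre + rec)  # 0.8227848101...
sub = float(inc) / act             # 0.1772151898...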
That's the moment I take a real look at my PhD code and pull my hair out. I'll need some time to fix this properly.
Ok, thanks.
I'll also try to take a look at the source code and see if I can help with it.
However, there should be some easy and reliable way to test the results (with smaller samples and manual calculations, I guess).
Yeah, part of it should really be writing tests for it.
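Something like the following could be a first self-agreement test. This is only a minimal sketch, assuming the statistics object returned by compare_to_gold exposes precision and recall attributes (those attribute names are a guess and would need to match the real fields):

import unittest
from bratutils import agreement as a

class SelfAgreementTest(unittest.TestCase):
    # Comparing a document against itself should give perfect agreement.
    def test_identical_documents_agree_perfectly(self):
        gold = a.Document('res/samples/A/data-sample-1.ann')
        doc = a.Document('res/samples/A/data-sample-1.ann')
        gold.make_gold()
        statistics = doc.compare_to_gold(gold)
        # 'precision' and 'recall' are hypothetical attribute names here.
        self.assertAlmostEqual(statistics.precision, 1.0)
        self.assertAlmostEqual(statistics.recall, 1.0)

if __name__ == '__main__':
    unittest.main()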
Hello. I need to compare automatic annotations performed by a software application with manual annotations (in brat standoff format), and this seems to be a nice tool to use.
While testing it and trying to understand the source code, I tried the following small sample code.
However, on execution of the compare_to_gold function, it says that Document instance has no attribute 'postag_list', which is true, but I don't understand where this comes from either. Am I missing something? Could you perhaps post a small working example for comparing two .ann files? I'd appreciate that. Thanks.