accuracy results (precision, recall, and f1) don't include statistics from empty parses

danielmorozoff / bubs-parser

Automatically exported from code.google.com/p/bubs-parser

What steps will reproduce the problem?

1. Check out the RELEASE_1_0 branch, changeset 1035:977a87837605. (The same behavior is still observed on a version of the default branch, changeset 1071:c867467ddf39, with issue 3 fixed.)
2. Run the following two commands, piped together, from bash:

echo -e "(ROOT (SBAR (ADVP (DT The)) (WHADVP (WRB why)) (S (X (SYM x)) (X (FW w) (FW v) (NP (PRP you)) (NN tee)) (NP (PRP s)) (VP (VBP are) (S (INTJ (JJ queue) (NN pea) (INTJ (UH oh))) (INTJ (JJ n) (NN m) (VB el) (VB que) (VB jay)) (PRN (S (NP (PRP I)) (VP (VBP age) (ADVP (JJ gee)) (SBAR (IN if) (X (SYM e)))))) (X (SYM thee)) (VP (VBP see) (NP (NN b) (NN a.))))))))\n(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))"| build-dist/parse -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias="-200,-200,-200,-200"

What is the expected output? What do you see instead?

Observed:

INFO: parser=CartesianProductHashMl fom=models/eng.sm6.fom.gz decode=ViterbiMax
INFO: -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias=-200,-200,-200,-200
()
(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))
INFO: numSentences=2 numFail=1 reparsedSentences=0 totalReparses=0 totalSeconds=0.227 cpuSeconds=0.227 avgSecondsPerSent=0.113 wordsPerSec=136.564 f1=100.00 prec=100.00 recall=100.00

Expected:

INFO: parser=CartesianProductHashMl fom=models/eng.sm6.fom.gz decode=ViterbiMax
INFO: -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias=-200,-200,-200,-200
()
(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))
INFO: numSentences=2 numFail=1 reparsedSentences=0 totalReparses=0 totalSeconds=0.227 cpuSeconds=0.227 avgSecondsPerSent=0.113 wordsPerSec=136.564 f1=20.00 prec=20.00 recall=20.00

Explanation: I think the aggregate f1 etc. should be lower, since I failed to get any correct constituents on the failed parse. There were 24 gold bracket-pairs in the gold tree corresponding to the failed parse and 6 in the other gold tree, so precision = 6/30 = .2, recall = 6/30 = .2, and f1 = 2*precision*recall/(precision + recall) = .2.

What version of the product are you using? On what operating system?

See above for the offending version. I'm running on 64-bit Linux.

Please provide any additional information below.
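
The bracket counts quoted in the report (24 and 6) can be checked with a short script. This is only a sketch: `count_brackets` is a hypothetical helper, not part of BUBS or evalb, and it assumes that a "bracket-pair" is any node that dominates at least one other node (so part-of-speech preterminals are skipped and ROOT is counted). That convention is chosen here because it reproduces the reported numbers; evalb's own conventions may differ.

```python
import re

def count_brackets(tree: str) -> int:
    """Count nodes that dominate at least one other node (assumed convention)."""
    tokens = re.findall(r"[()]|[^\s()]+", tree)
    stack = []  # one flag per open node: has a child node been seen yet?
    count = 0
    for tok in tokens:
        if tok == "(":
            # The first child node turns the enclosing node into a counted bracket.
            if stack and not stack[-1]:
                stack[-1] = True
                count += 1
            stack.append(False)
        elif tok == ")" and stack:
            stack.pop()
    return count

# Gold trees copied from the report.
GOLD_FAILED = (
    "(ROOT (SBAR (ADVP (DT The)) (WHADVP (WRB why)) "
    "(S (X (SYM x)) (X (FW w) (FW v) (NP (PRP you)) (NN tee)) (NP (PRP s)) "
    "(VP (VBP are) (S (INTJ (JJ queue) (NN pea) (INTJ (UH oh))) "
    "(INTJ (JJ n) (NN m) (VB el) (VB que) (VB jay)) "
    "(PRN (S (NP (PRP I)) (VP (VBP age) (ADVP (JJ gee)) (SBAR (IN if) (X (SYM e)))))) "
    "(X (SYM thee)) (VP (VBP see) (NP (NN b) (NN a.))))))))"
)
GOLD_PARSED = "(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))"

if __name__ == "__main__":
    print(count_brackets(GOLD_FAILED), count_brackets(GOLD_PARSED))
```

Under that assumption the empty parse `()` contributes no brackets at all, which is why a failed sentence can only affect recall, not precision.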

The default in Collins' evalb bracket-evaluation program is to ignore empty 
parses, and the BUBS reimplementation of evalb does the same. Revision 8e75def 
adds a configuration option (-O evalParseFailures=true) that includes empty / 
failed parses in bracket evaluation, penalizing recall.

Note: Since this defect was filed, other changes and bug-fixes eliminated the 
parse failure from the reported command-line. The examples below trigger that 
failure again and demonstrate the effect of -O evalParseFailures=true.

$ echo -e "(ROOT (SBAR (ADVP (DT The)) (WHADVP (WRB why)) (S (X (SYM x)) (X (FW w) (FW v) (NP (PRP you)) (NN tee)) (NP (PRP s)) (VP (VBP are) (S (INTJ (JJ queue) (NN pea) (INTJ (UH oh))) (INTJ (JJ n) (NN m) (VB el) (VB que) (VB jay)) (PRN (S (NP (PRP I)) (VP (VBP age) (ADVP (JJ gee)) (SBAR (IN if) (X (SYM e)))))) (X (SYM thee)) (VP (VBP see) (NP (NN b) (NN a.))))))))\n(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))"| build-dist/parse -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias="-200,-200,-200,-200" -O maxLocalDelta=5
INFO: parser=CartesianProductHashMl fom=models/eng.sm6.fom.gz decode=ViterbiMax
INFO: -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias=-200,-200,-200,-200 -O maxLocalDelta=5
()
(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))
INFO: numSentences=2 numFail=1 reparsedSentences=0 totalReparses=0 totalSeconds=0.165 cpuSeconds=0.165 avgSecondsPerSent=0.083 wordsPerSec=187.879 f1=100.00 prec=100.00 recall=100.00

$ echo -e "(ROOT (SBAR (ADVP (DT The)) (WHADVP (WRB why)) (S (X (SYM x)) (X (FW w) (FW v) (NP (PRP you)) (NN tee)) (NP (PRP s)) (VP (VBP are) (S (INTJ (JJ queue) (NN pea) (INTJ (UH oh))) (INTJ (JJ n) (NN m) (VB el) (VB que) (VB jay)) (PRN (S (NP (PRP I)) (VP (VBP age) (ADVP (JJ gee)) (SBAR (IN if) (X (SYM e)))))) (X (SYM thee)) (VP (VBP see) (NP (NN b) (NN a.))))))))\n(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))"| build-dist/parse -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias="-200,-200,-200,-200" -O maxLocalDelta=5 -O evalParseFailures=true
INFO: parser=CartesianProductHashMl fom=models/eng.sm6.fom.gz decode=ViterbiMax
INFO: -g models/eng.sm6.gr.gz -fom models/eng.sm6.fom.gz -beamModel models/eng.sm6.bcm.gz -if Tree -O beamModelBias=-200,-200,-200,-200 -O maxLocalDelta=5 -O evalParseFailures=true
()
(ROOT (S (NP (DT This)) (VP (MD should) (VP (VB be) (ADJP (JJ easy)))) (. .)))
INFO: numSentences=2 numFail=1 reparsedSentences=0 totalReparses=0 totalSeconds=0.212 cpuSeconds=0.212 avgSecondsPerSent=0.106 wordsPerSec=146.226 f1=33.33 prec=100.00 recall=20.00
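
The difference between the two runs comes down to whether the failed sentence's gold brackets enter the totals. The sketch below is not the BUBS or evalb code; `SentenceCounts` and `bracket_scores` are hypothetical names, and the bracket counts (24 gold brackets for the failed parse, 6 for the other sentence) are taken from the original report. It also assumes the second sentence's parse matches its gold tree exactly, which is consistent with the output above.

```python
from dataclasses import dataclass

@dataclass
class SentenceCounts:
    matched: int     # brackets found in both the parse and the gold tree
    predicted: int   # brackets in the parser output
    gold: int        # brackets in the gold tree
    failed: bool = False

def bracket_scores(sentences, eval_parse_failures=False):
    """Aggregate precision, recall and F1 over sentences.

    Failed (empty) parses are skipped unless eval_parse_failures is set,
    mirroring the behavior described for -O evalParseFailures=true.
    """
    matched = predicted = gold = 0
    for s in sentences:
        if s.failed and not eval_parse_failures:
            continue
        matched += s.matched
        predicted += s.predicted
        gold += s.gold
    precision = matched / predicted if predicted else 0.0
    recall = matched / gold if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Counts from the report: the empty parse predicts nothing against 24 gold brackets.
SENTENCES = [
    SentenceCounts(matched=0, predicted=0, gold=24, failed=True),
    SentenceCounts(matched=6, predicted=6, gold=6),
]

if __name__ == "__main__":
    for flag in (False, True):
        p, r, f = bracket_scores(SENTENCES, eval_parse_failures=flag)
        print(f"evalParseFailures={str(flag).lower()}: "
              f"f1={100 * f:.2f} prec={100 * p:.2f} recall={100 * r:.2f}")
```

With the failure included, precision stays at 6/6 because the empty parse proposes no brackets, while recall drops to 6/30, giving f1 = 2(1.0)(0.2)/(1.2) = 0.3333. This is why the observed result is prec=100.00 rather than the 20.00 anticipated in the original report.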

Original comment by aaron.du...@gmail.com on 30 May 2013 at 7:26

Changed state: Fixed

accuracy results (precision, recall, and f1) don't include statistics from empty parses #5