proycon / gecco

Generic Environment for Context-Aware Correction of Orthography
GNU General Public License v3.0

Punctuation/recase module performs badly... has to be redesigned #9

Open proycon opened 8 years ago

proycon commented 8 years ago

The new module is implemented, but it needs more thorough testing and its parameters still have to be tuned. An initial evaluation of Valkuil on the CITO data shows that recasing and deletions are barely working, if at all (capitalizationerror and redundantpunctuation both score P=0, R=0), and that precision and recall for missing-punctuation insertion are still very low (P=0.11, R=0.07).

OVERALL RESULTS
=================
 Documents                                  :  520
 Total number of corrections in output      :  4390
 Total number of corrections in reference   :  10831
 Matching output corrections (tp)           :  1926
 Missed output corrections (fp)             :  2464
 Missed reference corrections (fn)          :  9118
 Virtual total (tp+fn)                      :  11044
 Precision (micro)                          :  0.44
 Recall (micro)                             :  0.17
 F1-score (micro)                           :  0.25

Aggregated corrections when they are on the same words:
 Aggregated average corrections in output              :  1.04
 Total number of aggregated corrections in output      :  4072
 Total number of aggregated corrections in reference   :  10831
 Matching output aggregated corrections (tp)           :  1713
 Missed output aggregated corrections (fp)             :  2359
 Missed reference aggregated corrections (fn)          :  18236
 Virtual total (tp+fn)                                 :  19949
 Aggregated precision (micro)                          :  0.42
 Aggregated recall (micro)                             :  0.09
 Aggregated F1-score (micro)                           :  0.14

PER-MODULE RESULTS
====================
Precision for confusible_de_het :  0.45     (89/196)
Precision for confusible_deze_dit :  0.37     (7/19)
Precision for confusible_hard_hart :  1.0     (1/1)
Precision for confusible_hun_zij :  0.48     (10/21)
Precision for confusible_licht_ligt :  1.0     (3/3)
Precision for confusible_me_mijn :  0.8     (53/66)
Precision for confusible_u_uw :  0.85     (45/53)
Precision for confusible_word_wordt :  0.9     (112/125)
Precision for confusible_zei_zij :  0.8     (4/5)
Precision for confusiblesuffix_d_dt :  0.69     (9/13)
Precision for errorlist :  0.71     (256/359)
Precision for hunspell :  0.5     (957/1929)
Precision for puncrecase :  0.11     (124/1114)
Precision for runon :  0.51     (71/138)
Precision for splits :  0.6     (185/310)

PER-CLASS RESULTS
====================
archaic :  P=0  R=0.0   F=0.0
capitalizationerror :  P=0.0    R=0.0   F=0.0
confusion :  P=0.66     R=0.18  F=0.28
missingpunctuation :  P=0.11    R=0.07  F=0.08
missingword :  P=0      R=0.0   F=0.0
nonworderror :  P=0.52  R=0.55  F=0.54
redundantpunctuation :  P=0     R=0.0   F=0.0
redundantword :  P=0    R=0.0   F=0.0
runonerror :  P=0.56    R=0.45  F=0.5
spliterror :  P=0.6     R=0.22  F=0.32
uncertain :  P=0        R=0.0   F=0.0

REFERENCE CLASS DISTRIBUTION
================================
archaic :  1 0.0%
capitalizationerror :  2374 21.9%
confusion :  1832 16.9%
missingpunctuation :  1849 17.1%
missingword :  960 8.9%
nonworderror :  1747 16.1%
redundantpunctuation :  306 2.8%
redundantword :  490 4.5%
runonerror :  333 3.1%
spliterror :  831 7.7%
uncertain :  108 1.0%

OUTPUT CLASS DISTRIBUTION
================================
capitalizationerror :  5 0.1%
confusion :  507 11.5%
missingpunctuation :  1114 25.4%
nonworderror :  2184 49.7%
runonerror :  270 6.2%
spliterror :  310 7.1%
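
For reference, the micro-averaged scores reported above can be reproduced from the raw tp/fp/fn counts. The snippet below is only a minimal sketch, not gecco's own evaluation code: the helper name `micro_scores` is made up for illustration, and the puncrecase line assumes that all 124 matching puncrecase corrections fall in the missingpunctuation class (the output distribution lists 1114 missingpunctuation items, the same number puncrecase produced).

```python
def micro_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts.

    Hypothetical helper for illustration; empty denominators yield 0.0.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Counts copied from the OVERALL RESULTS block of the log.
overall = micro_scores(tp=1926, fp=2464, fn=9118)
aggregated = micro_scores(tp=1713, fp=2359, fn=18236)

# puncrecase: 124 of its 1114 output corrections match; the reference holds
# 1849 missingpunctuation items (assumption: all 124 matches are of that class).
puncrecase = micro_scores(tp=124, fp=1114 - 124, fn=1849 - 124)

for name, (p, r, f) in [("overall", overall),
                        ("aggregated", aggregated),
                        ("puncrecase", puncrecase)]:
    print(f"{name:<11} P={p:.2f} R={r:.2f} F1={f:.2f}")
```

Under that assumption the module finds roughly 1 in 15 of the missing punctuation marks in the reference, and about 9 out of 10 of its suggestions are wrong, which is consistent with the per-class line for missingpunctuation.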