gchrupala / morfette

Supervised learning of morphology
BSD 2-Clause "Simplified" License

an option to eval without punctuation #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

It would be great to have an option to skip some POS tags (such as the PTB punct 
tag, or PONCT in the French treebank) during evaluation. The idea is to be 
able to replicate the POS tag accuracy one gets from Evalb.
Of course I know you're very busy, but the possibility of ignoring differences between some POS tags would also be nice (while I'm at it, I could ask for the eval module to parse Evalb parameter files, and one unicorn for Christmas :) ).
Seriously, ignoring punctuation tags would already be a big step...

One could imagine an --ignoretags option that takes a single argument such as 
"PUNCT , ; !", for example.

Best
Djamé

Original issue reported on code.google.com by djame.seddah@gmail.com on 27 Oct 2010 at 1:14

GoogleCodeExporter commented 9 years ago
Hi Djame,
This option is already implemented. The options listed for the eval command are:

eval:     evaluate morpho-tagging and lemmatization results
eval [OPTION...] TRAIN-FILE GOLD-FILE TEST-FILE 
    --ignore-case            ignore case for evaluation
    --baseline-file=PATH     path to baseline results
    --dict-file=PATH         path to optional dictionary
    --ignore-punctuation     ignore punctuation for evaluation
    --ignore-pos=POS-prefix  ignore POS starting with POS-prefix for evaluation

So you can ignore punctuation, or ignore arbitrary POS tags. File an issue if 
it's not working as expected.

Original comment by pitekus on 27 Oct 2010 at 9:49
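
For readers wondering what prefix-based ignoring does to the score, here is a minimal Python sketch. It is not Morfette's code: the function name `pos_accuracy`, its parameters, and the toy tag sequences are made up for illustration, and it assumes a token is skipped when its gold tag starts with one of the given prefixes.

```python
from typing import Iterable, Sequence


def pos_accuracy(
    gold: Sequence[str],
    predicted: Sequence[str],
    ignore_prefixes: Iterable[str] = (),
) -> float:
    """Tag accuracy over tokens whose gold tag matches no ignored prefix.

    Hypothetical helper: filtering on the gold tag is an assumption,
    not a description of what Morfette actually does.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    prefixes = tuple(ignore_prefixes)
    # Keep only the token pairs that should count towards the score.
    kept = [
        (g, p)
        for g, p in zip(gold, predicted)
        if not any(g.startswith(prefix) for prefix in prefixes)
    ]
    if not kept:
        return 0.0

    correct = sum(1 for g, p in kept if g == p)
    return correct / len(kept)


# Toy example with French-treebank-style tags (invented data).
gold = ["DET", "NC", "V", "PONCT"]
predicted = ["DET", "NC", "ADJ", "PONCT"]

print(pos_accuracy(gold, predicted))             # all four tokens count
print(pos_accuracy(gold, predicted, ["PONCT"]))  # punctuation token skipped
```

Punctuation is usually tagged correctly, so skipping it tends to lower the reported accuracy. That is why it matters when comparing against numbers computed without punctuation.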