Open GoogleCodeExporter opened 9 years ago
Do you mean this page? http://www.maltparser.org/mco/french_parser/fremalt.html
part-of-speech tags of the MElt tagger (Denis and Sagot, 2009)
see: http://raweb.inria.fr/rapportsactivite/RA2009/alpage/uid89.html
paper: http://atoll.inria.fr/~sagot/pub/paclic09tagging.pdf
"In the original FTB, words are split into 13 main categories, themselves divided into 34 subcategories.
The version of the treebank we used was obtained by converting subcategories into a tagset consisting of 28 tags, with a granularity that is intermediate between categories and subcategories.
Basically, these tags enhance main categories with information on the mood of verbs and a few other lexical features. This expanded tagset has been shown to give the best statistical parsing results for French (Crabbé and Candito, 2008).
(Footnote 2: This tagset is known as TREEBANK+ in (Crabbé and Candito, 2008), and since then as CC (Candito et al., 2009).)"
This page has some more references, including (Crabbé and Candito, 2008):
http://alpage.inria.fr/statgram/frdep/fr_stat_dep_parsing.html
The tagset with the 28 tags is on page 8 of this paper:
http://alpage.inria.fr/statgram/frdep/Publications/crabbecandi-taln2008-final.pdf
Looking at this tagset, it seems that something is going wrong in your POS tagset extraction from the maltparser model.
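One way to check this is to diff the extracted tags against the reference tagset from the paper. Below is a minimal sketch in Python. The function name `compare_tagsets` and the tag values are hypothetical placeholders, not the actual CC tags and not anything read from the maltparser model. The real reference set would have to be filled in by hand from the table in the paper.

```python
def compare_tagsets(extracted, reference):
    """Return the tags that appear in only one of the two tagsets."""
    extracted, reference = set(extracted), set(reference)
    return {
        # tags produced by the extraction that the reference does not list
        "unexpected": sorted(extracted - reference),
        # reference tags that the extraction did not produce
        "missing": sorted(reference - extracted),
    }


# Placeholder values for illustration only (NOT the real tagset).
reference = {"TAG_A", "TAG_B", "TAG_C"}
extracted = {"TAG_A", "TAG_B", "tag_x"}

report = compare_tagsets(extracted, reference)
print(report)
```

A non-empty "unexpected" list would suggest that the extraction is picking up symbols that are not POS tags, while a non-empty "missing" list would suggest that some tags are being dropped.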
Original comment by eckle.kohler on 12 Sep 2013 at 6:32
Original issue reported on code.google.com by richard.eckart on 11 Sep 2013 at 9:07