ftyers / docs

Universal Dependencies online documentation
http://universaldependencies.github.io/docs/
Apache License 2.0

[ud] Release v1.3: pos mapping: abbr => ? #23

Open makazhan opened 8 years ago

makazhan commented 8 years ago

POS:

depends...

Tokenization:

always joint? I.e., keep the punctuation within the token.

Dependency:

depends... relations currently used for abbr tokens, in descending order of frequency:

     28 nmod
      9 name
      6 nummod
      6 cmpnd
      2 root
      2 advmod
      1 appos
      1 amod
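
For reference, counts like the ones above could be reproduced with a short script. This is only a minimal sketch: it assumes the treebank is in CoNLL-U format and that the abbr tag sits in the fourth column, which may not match the actual files. The function name `count_deprels` and the sample rows are made up for illustration.

```python
from collections import Counter


def count_deprels(conllu_text, tag="abbr", tag_column=3):
    """Count the dependency relations of tokens whose tag column equals `tag`.

    Assumption: 10 tab-separated columns per token line (CoNLL-U), with the
    tag in column index 3 and the relation in column index 7.
    """
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # skip anything that is not a regular token line
        if cols[tag_column] == tag:
            counts[cols[7]] += 1
    return counts


# Invented sample rows, NOT taken from the treebank.
SAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ["1", "КСРО", "КСРО", "abbr", "_", "_", "4", "nmod", "_", "_"],
        ["2", "млн.", "млн.", "abbr", "_", "_", "3", "nummod", "_", "_"],
        ["3", "долл.", "долл.", "abbr", "_", "_", "4", "nmod", "_", "_"],
        ["4", "x", "x", "NOUN", "_", "_", "0", "root", "_", "_"],
    ]
)

if __name__ == "__main__":
    # Print relations from most to least frequent, like the list above.
    for deprel, n in count_deprels(SAMPLE).most_common():
        print(f"{n:7d} {deprel}")
```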

ftyers commented 8 years ago

Perhaps something like:

А.              abbr => PROPN
б.з.б.          abbr => ADV
Б.з.б.          abbr => ADV
Ғ.              abbr => PROPN
долл.           abbr => NOUN
драм.           abbr => NOUN
дүниежүзілік    abbr => ADJ
ж.              abbr => NOUN
КСРО            abbr => PROPN
М.              abbr => PROPN
млн.            abbr => NOUN/NUM
млрд.           abbr => NOUN/NUM
мыс.            abbr => NOUN
Р.              abbr => PROPN
Т.              abbr => PROPN
т.б.            abbr => ADV
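
A mapping like this could be kept as a small lookup table. The sketch below is one hypothetical way to do it, not an existing conversion script: the names `ABBR_TO_UPOS` and `candidate_upos` are invented here, the open NOUN/NUM cases are stored as two candidates rather than resolved, and unlisted forms return an empty tuple so they can be checked by hand.

```python
# Proposed abbr => UPOS mapping from the list above.
# Entries with two candidates (NOUN/NUM) are still undecided.
ABBR_TO_UPOS = {
    "А.": ("PROPN",),
    "б.з.б.": ("ADV",),
    "Б.з.б.": ("ADV",),
    "Ғ.": ("PROPN",),
    "долл.": ("NOUN",),
    "драм.": ("NOUN",),
    "дүниежүзілік": ("ADJ",),
    "ж.": ("NOUN",),
    "КСРО": ("PROPN",),
    "М.": ("PROPN",),
    "млн.": ("NOUN", "NUM"),
    "млрд.": ("NOUN", "NUM"),
    "мыс.": ("NOUN",),
    "Р.": ("PROPN",),
    "Т.": ("PROPN",),
    "т.б.": ("ADV",),
}


def candidate_upos(form):
    """Return the candidate UPOS tags for an abbr token.

    An empty tuple means the form is not in the table and needs
    manual review.
    """
    return ABBR_TO_UPOS.get(form, ())


if __name__ == "__main__":
    for form in ["КСРО", "млн.", "unknown."]:
        print(form, candidate_upos(form))
```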

makazhan commented 8 years ago

cool! thanks!

дүниежүзілік abbr => ADJ

Sure it's not just an adjective rather than an abbreviation? It seems like an adjective to me.

ftyers commented 8 years ago

It's definitely ADJ, but I think in the original corpus it was probably abbreviated to дүниежүз. or something like that. It's safe to change it to ADJ anyway. Btw, if you don't have commit access, create a SourceForge account and send us your username, and we'll add you.