alpheios-project / pyperseus-treebank


UD POS #1

Open epageperron opened 6 years ago

epageperron commented 6 years ago

Hi Thibault,

Here is the list:

a   ADJ
c   CCONJ
d   ADV
e   INTJ
m   NUM
n   NOUN
p   PRON
r   ADP
t   VERB
u   PUNCT
v   VERB

I found the conversion here: http://universaldependencies.org/tagset-conversion/la-conll-uposf.html

Hope it can be useful

PonteIneptique commented 6 years ago

Thanks a lot for the issue, looking forward to implementing it

epageperron commented 6 years ago

First column based on: https://github.com/alpheios-project/xml_ctl_files/blob/63b2abbdd476663cba201bdc083dbabaca0c461a/xslt/tags/1.0alpha1/aldt-util.xsl

Alpheios  la::conll  UD
a         a          ADJ
c         c          CCONJ
d         d          ADV
e         e          INTJ
g         (none)     PART
i         i          INTJ
l         (none)     DET
m         m          NUM
n         n          NOUN
p         p          PRON
r         r          ADP
t         t          VERB
u         u          PUNCT
v         v          VERB
x         (none)     X
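
In case it helps, the table above could be written as a plain Python lookup. This is only a sketch, not what the library does: the names `PERSEUS_TO_UPOS` and `to_upos` are made up for illustration. It assumes the part-of-speech letter is the first character of the postag string, and falling back to `X` for unknown letters is my own choice.

```python
# Hypothetical lookup table copied from the Alpheios -> UD columns above.
PERSEUS_TO_UPOS = {
    "a": "ADJ", "c": "CCONJ", "d": "ADV", "e": "INTJ", "g": "PART",
    "i": "INTJ", "l": "DET", "m": "NUM", "n": "NOUN", "p": "PRON",
    "r": "ADP", "t": "VERB", "u": "PUNCT", "v": "VERB", "x": "X",
}


def to_upos(postag: str) -> str:
    """Map a treebank postag to a UD tag using its first character.

    Assumption: the POS letter comes first in the postag string.
    Unknown or empty tags fall back to "X" (an illustrative choice).
    """
    return PERSEUS_TO_UPOS.get(postag[:1].lower(), "X")


print(to_upos("n-s---fn-"))
```

Note that `t` and `v` both collapse to `VERB`, and `e` and `i` both collapse to `INTJ`, so the conversion is one-way: the original tag cannot be recovered from the UD tag alone.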

PonteIneptique commented 6 years ago

Thanks, I'll think about a way to do this export in a clean fashion

PonteIneptique commented 6 years ago

Hey @epageperron, I just added some support for that :) Check the README! ;)

epageperron commented 6 years ago

Fantastic! If you like, you could simplify things by putting the UD tags in the UPOSTAG column and the Alpheios/la::conll tags in the XPOSTAG column, rather than offering the option of generating two different outputs. The Alpheios tagset looks like an extended version of the la::conll tagset; because these are the tags used by Latin specialists, they should probably go in the XPOSTAG column of the CoNLL-U file, with the Universal Dependencies (UD) tags in the UPOSTAG column. See the description of the CoNLL-U columns at http://universaldependencies.org/format.html
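
To make the suggestion concrete, here is a rough sketch of a single CoNLL-U token line carrying both tags at once. The function name `conllu_line` and the sample token are invented for illustration and are not taken from the library. The ten tab-separated fields follow the CoNLL-U format description, where the two tag columns are named UPOS and XPOS in the current version.

```python
# The ten CoNLL-U fields, in order (UPOS/XPOS were UPOSTAG/XPOSTAG in v1).
CONLLU_FIELDS = (
    "ID", "FORM", "LEMMA", "UPOS", "XPOS",
    "FEATS", "HEAD", "DEPREL", "DEPS", "MISC",
)


def conllu_line(token_id: int, form: str, lemma: str,
                upos: str, xpos: str, head: int, deprel: str) -> str:
    """Build one tab-separated CoNLL-U token line (hypothetical helper)."""
    row = {
        "ID": str(token_id),
        "FORM": form,
        "LEMMA": lemma,
        "UPOS": upos,   # Universal Dependencies tag, e.g. NOUN
        "XPOS": xpos,   # original treebank tag, kept untouched
        "HEAD": str(head),
        "DEPREL": deprel,
    }
    # Fields that are not supplied are written as "_", as CoNLL-U requires.
    return "\t".join(row.get(name, "_") for name in CONLLU_FIELDS)


# Invented example token, for illustration only.
print(conllu_line(1, "rosa", "rosa", "NOUN", "n-s---fn-", 2, "nsubj"))
```

With both columns filled, a single output serves both audiences: UD tools read the fourth column, and Latinists still have the full original tag in the fifth.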