dkpro / dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core

Support additional CoNLL-U metadata #1438

Open reckart opened 4 years ago

reckart commented 4 years ago

Add support for:

Example source: https://raw.githubusercontent.com/UniversalDependencies/UD_Ukrainian-IU/master/uk_iu-ud-dev.conllu

```
# doc_title = Сад Гетсиманський
# newdoc id = 028g
# newpar id = 02tb
# sent_id = 02to
# text = Дідусь, той що атестував, посміхнувся й спитав:
# translit = Diduś, toj ščo atestuvav, posmichnuvśа j spytav:
1	Дідусь	дідусь	NOUN	Ncmsny	Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing	7	nsubj	7:nsubj|9:nsubj	Id=02tp|LTranslit=diduś|SpaceAfter=No|Translit=Diduś
2	,	,	PUNCT	U	_	3	punct	3:punct	Id=02tq|LTranslit=,|Translit=,
3	той	той	DET	Pd--m-sna	Case=Nom|Gender=Masc|Number=Sing|PronType=Dem	7	dislocated	5:nsubj:rel|7:dislocated|9:dislocated	Id=02tr|LTranslit=toj|Translit=toj
4	що	що	SCONJ	Css	_	5	mark	5:mark	Id=02ts|LTranslit=ščo|Translit=ščo
5	атестував	атестувати	VERB	Vmpis-sm	Aspect=Imp|Gender=Masc|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin	3	acl:relcl	3:acl:relcl	Id=02tt|LTranslit=atestuvaty|SpaceAfter=No|Translit=atestuvav
6	,	,	PUNCT	U	_	5	punct	5:punct	Id=02tu|LTranslit=,|Translit=,
7	посміхнувся	посміхнутися	VERB	Vmeis-sm	Aspect=Perf|Gender=Masc|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin	0	root	0:root	Id=02tv|LTranslit=posmichnutyśа|Translit=posmichnuvśа
8	й	й	CCONJ	Ccs	_	9	cc	9:cc	Id=02tw|LTranslit=j|Translit=j
9	спитав	спитати	VERB	Vmeis-sm	Aspect=Perf|Gender=Masc|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin	7	conj	0:root|7:conj	Id=02tx|LTranslit=spytaty|SpaceAfter=No|Translit=spytav
10	:	:	PUNCT	U	_	7	punct	7:punct	Id=02ty|LTranslit=:|Translit=:
```
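
For illustration only, here is a minimal sketch of how the `# key = value` comment lines in a sentence block like the one above could be collected into a map. The class `ConllUComments` and the method `parseMetadata` are hypothetical names made up for this example; they are not part of the DKPro Core API and say nothing about how the reader should actually store the values in the CAS. The sketch assumes that the text before the first `=` is the key (so `newdoc id = 028g` yields the key `newdoc id`) and that a comment without `=` is a flag with an empty value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConllUComments {

    /**
     * Hypothetical helper: collects "# key = value" comment lines from one
     * CoNLL-U sentence block into an insertion-ordered map.
     */
    public static Map<String, String> parseMetadata(String block) {
        Map<String, String> meta = new LinkedHashMap<>();
        for (String line : block.split("\n")) {
            if (!line.startsWith("#")) {
                continue; // token line or blank line, not metadata
            }
            String body = line.substring(1).trim();
            int eq = body.indexOf('=');
            if (eq < 0) {
                // Assumption: a comment without '=' is a flag with no value
                if (!body.isEmpty()) {
                    meta.put(body, "");
                }
                continue;
            }
            // Split on the first '=' only, so values may contain '='
            String key = body.substring(0, eq).trim();
            String value = body.substring(eq + 1).trim();
            meta.put(key, value);
        }
        return meta;
    }
}
```

Applied to the example block, this would yield the keys `doc_title`, `newdoc id`, `newpar id`, `sent_id`, `text` and `translit` in that order, while the ten token lines are skipped.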