PerseusDL / treebank_data

Perseus Treebank Data

transformed data in 1.6 incorrectly converted some lemmas #3

Open balmas opened 9 years ago

balmas commented 9 years ago

Reported by @rgorman:

The transformation did not take into account that the lemmas for some punctuation marks were written in English. Thus we find this kind of thing:

<word id="3" cid="36272653" form="," lemma="ξομμα1" postag="u--------" head="5" relation="AuxZ" cite=""/>

Note "ξομμα1" where the lemma should be "comma1".
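
The substitution in the example (c to ξ, o to ο, m to μ, a to α) matches the basic Beta Code letter table, so one possible repair is to map Greek letters back to Latin for punctuation tokens only. The following is a minimal sketch, not the project's actual fix. It assumes that a postag beginning with "u" marks punctuation, that the affected lemmas contain only unaccented lowercase letters and digits, and the names `GREEK_TO_LATIN` and `restore_punctuation_lemma` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Reverse of the basic Beta Code letter table (assumption: affected
# lemmas contain only unaccented lowercase letters plus digits).
GREEK_TO_LATIN = str.maketrans({
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "h", "θ": "q", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "c", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "u", "φ": "f", "χ": "x", "ψ": "y",
    "ω": "w",
})


def restore_punctuation_lemma(word):
    """Return the lemma, mapped back to Latin letters for punctuation tokens.

    Assumption: a postag starting with "u" identifies punctuation.
    Lemmas of all other tokens are returned unchanged.
    """
    lemma = word.get("lemma", "")
    if word.get("postag", "").startswith("u"):
        return lemma.translate(GREEK_TO_LATIN)
    return lemma


# The <word> element quoted in the report above.
sample = (
    '<word id="3" cid="36272653" form="," lemma="ξομμα1" '
    'postag="u--------" head="5" relation="AuxZ" cite=""/>'
)
word = ET.fromstring(sample)
print(restore_punctuation_lemma(word))  # prints "comma1"
```

Restricting the mapping to punctuation tokens leaves genuine Greek lemmas untouched. Any lemma that was altered in a way not covered by the letter table would still need manual review.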