kscanne / gaelg

NLP resources for Manx Gaelic, mainly in support of the gv2ga MT engine
GNU General Public License v3.0

Tokenization of copula forms #2

Open kscanne opened 3 years ago

kscanne commented 3 years ago

We probably want to split off more copulas for cross-lingual consistency: adjectives such as s'messey, s'odjey, sloo, etc., and also forms like saillym, shegin, shione, shynney (PM p. 140ff).
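A rough sketch of what the splitting could look like, in Python. Nothing here is taken from the repo: `split_copula`, `tokenize` and the `FUSED` table are hypothetical names, and the split point for "sloo" is an assumption for illustration only. Forms like saillym, shegin, shione and shynney would need their own table entries once the split points are decided.

```python
import re

# Hypothetical table of fused forms without an apostrophe, mapped to the
# index where the copula ends. The entry for "sloo" is an assumption.
FUSED = {"sloo": 1}

# Copula written with an apostrophe (straight or curly), e.g. s'messey.
APOSTROPHE_COPULA = re.compile(r"^(s['’])(\w+)$", re.IGNORECASE)


def split_copula(token):
    """Return the token as a list of sub-tokens, splitting off a copula if found."""
    m = APOSTROPHE_COPULA.match(token)
    if m:
        return [m.group(1), m.group(2)]
    idx = FUSED.get(token.lower())
    if idx is not None:
        # Slice the original token so its casing is preserved.
        return [token[:idx], token[idx:]]
    return [token]


def tokenize(text):
    """Whitespace-tokenize, then split copulas off each token."""
    out = []
    for tok in text.split():
        out.extend(split_copula(tok))
    return out
```

Keeping the apostrophe forms as a rule and the rest as a lookup table means new fused forms can be added without touching the code.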

kscanne commented 3 years ago

As part of this, we should distinguish "she" from a plain copula (which might mean changing the lemma with the UD project for the validator script).