UAlbertaALTLab / crk-db

Managing the Plains Cree dictionary database
https://itwewina.altlab.app/
GNU General Public License v3.0

matching MD <ts> cases with CW <c> or <ci> cases #3

Closed aarppe closed 3 years ago

aarppe commented 4 years ago

Besides the straightforward linking of MD and CW content (removing vowel-length marking from CW dictionary entries, and converting <ch> to <c> and prefix spaces to hyphens in MD entries), there are other orthographic divergences that are less systematic and thus harder to match. One of the more frequent is the MD convention of using <ts> where CW has <c> or <ci>. So, while much of the matching can be automated, it will need final manual fixing and validation.
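
For reference, the systematic part of the matching could be sketched roughly as below. This is only an illustration, not the code used in this repo: the function names `cw_key` and `md_key` are made up, it assumes CW marks vowel length with a circumflex or macron, and it assumes every space in an MD entry is a prefix space.

```python
import unicodedata

# Assumption: vowel length in CW is written with a combining circumflex
# (U+0302) or macron (U+0304) once the string is decomposed.
LENGTH_MARKS = {"\u0302", "\u0304"}


def cw_key(entry: str) -> str:
    """Hypothetical helper: drop vowel-length marks from a CW entry."""
    decomposed = unicodedata.normalize("NFD", entry.strip().lower())
    stripped = "".join(ch for ch in decomposed if ch not in LENGTH_MARKS)
    return unicodedata.normalize("NFC", stripped)


def md_key(entry: str) -> str:
    """Hypothetical helper: <ch> becomes <c>, spaces become hyphens in an MD entry."""
    return entry.strip().lower().replace("ch", "c").replace(" ", "-")


print(cw_key("wâpamêw"))  # wapamew
print(cw_key("mîcisow"))  # micisow
print(md_key("mitsow"))   # mitsow
```

With keys like these, `md_key("mitsow")` and `cw_key("mîcisow")` still differ, which is the kind of pair that the systematic rules alone do not link.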

An example is MD mitsow 'He eats.', which is not matched with CW mîcisow.

A similar but less frequently observed variation is MD using a short <i> where CW has a different unstressed short vowel such as <a>, e.g. MD wapimew is not matched with CW wâpamêw. There might be other CW short vowels rendered as <i> in MD as well.
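
One possible way to surface such pairs for manual review is a looser comparison key that ignores the <i>/<a> difference. The sketch below is an assumption, not an existing part of the pipeline: `loose_key` is a made-up name, it only collapses <i> and <a>, and it will over-match (for instance, CW long <î> also collapses once length marks are removed), so its hits are candidates to validate by hand rather than confirmed matches.

```python
import unicodedata


def strip_length(form: str) -> str:
    """Remove combining circumflex/macron (assumed vowel-length marks)."""
    decomposed = unicodedata.normalize("NFD", form.lower())
    kept = "".join(ch for ch in decomposed if ch not in {"\u0302", "\u0304"})
    return unicodedata.normalize("NFC", kept)


def loose_key(form: str) -> str:
    """Hypothetical key: after removing length marks, treat <i> like <a>."""
    return strip_length(form).replace("i", "a")


md_form, cw_form = "wapimew", "wâpamêw"

# The strict comparison fails, the loose one flags a candidate pair.
print(strip_length(md_form) == strip_length(cw_form))  # False
print(loose_key(md_form) == loose_key(cw_form))        # True
```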

dwhieb commented 3 years ago

These notes have been incorporated into #5. Closing this issue.