Problems regarding segmentation and tokenization of sounds/segments

glottobank / tukano

Repository for computer-guided reconstruction with Jena wordlist standard for Tukano language data

GNU General Public License v2.0

Entry ID 977 in *PT is missing its initial segment ~~(but in the IPA column it is there. Looks like it got deleted)~~

Entry IDs 107, 108, 109, 111, 113, 115, 1120 in *PT have a split segment "t j?"; it should be "tj?", as we previously aligned it.

Entry ID 1127 in *PT has a split segment "k ?"; it should be "k?", as we previously aligned it.

The same applies to entries 76, 1269, 1316, 1338 in *PT, where "t j" should be "tj".

The same applies to entries 227, 291 in *PT, where "p ?" should be "p?".

The same applies to entries 306, 1077 in *PT, where "t ?" should be "t?".

The same applies to entries 660, 1019 in *PT, where "k k" should be "kk".

etc.
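
The repairs listed above could be sketched roughly as follows. This is only an illustration, not the repository's actual tooling: it assumes the *PT tokens are stored as a space-separated string, and the function name `merge_split_segments` and the `MERGES` table are hypothetical, built from the pairs named in this issue.

```python
# Hypothetical merge table, taken only from the split pairs listed in this issue.
MERGES = {
    ("t", "j?"): "tj?",
    ("t", "j"): "tj",
    ("k", "?"): "k?",
    ("p", "?"): "p?",
    ("t", "?"): "t?",
    ("k", "k"): "kk",
}


def merge_split_segments(tokens, merges=MERGES):
    """Rejoin adjacent tokens that form a known wrongly split segment.

    `tokens` is assumed to be a space-separated segment string.
    """
    segments = tokens.split()
    merged = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in merges:
            # Two neighbouring tokens match a known split: emit the joined segment.
            merged.append(merges[pair])
            i += 2
        else:
            merged.append(segments[i])
            i += 1
    return " ".join(merged)


print(merge_split_segments("a t j? o"))  # a tj? o
```

A blanket pass like this would also join sequences such as "k k" where two separate segments are intended, so it is probably safer to run it only on the entry IDs listed above and check the result against the earlier alignment.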

Problems regarding segmentation and tokenization of sounds/segments #16