Add an additional segmentation level for phrases

So far, the segmentation distinguishes only morphemes and phonemes. However, the data occasionally also shows word boundaries, as in this entry:

693 HPUN "six": kɔ́ŋyɔ́ŋ màŋtàŋ | 431-1

I suggest using the underscore for word boundaries in these cases, since it is already recognized as a separation marker, and writing the entry internally as:

k ɔ́ ◦ ŋ y ɔ́ ŋ _ m à ŋ ◦ t à ŋ

instead of

k ɔ́ ◦ ŋ y ɔ́ ŋ ◦ m à ŋ ◦ t à ŋ

Since this will be needed only in a few sporadic cases, writing an algorithm for this level of segmentation does not seem worthwhile. However, we should keep it in mind when dealing with the data later on.
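To illustrate how the three levels would nest, here is a minimal sketch of reading such a string. It assumes that `_` marks word boundaries, `◦` marks morpheme boundaries, and spaces separate phonemes; the function name `split_levels` is hypothetical and not part of any existing code in the repository.

```python
# Assumed markers: "_" for word boundaries, "◦" for morpheme boundaries,
# whitespace between phonemes.
WORD_SEP = "_"
MORPHEME_SEP = "◦"


def split_levels(segments):
    """Return a nested list: words -> morphemes -> phonemes."""
    words = []
    for word in segments.split(WORD_SEP):
        # Split each word into morphemes, then each morpheme into phonemes.
        morphemes = [m.split() for m in word.split(MORPHEME_SEP)]
        # Drop empty morphemes caused by stray separators.
        morphemes = [m for m in morphemes if m]
        if morphemes:
            words.append(morphemes)
    return words


with_words = split_levels("k ɔ́ ◦ ŋ y ɔ́ ŋ _ m à ŋ ◦ t à ŋ")
without_words = split_levels("k ɔ́ ◦ ŋ y ɔ́ ŋ ◦ m à ŋ ◦ t à ŋ")
```

With the underscore, the entry is read as two words of two morphemes each; with only `◦`, it is read as a single word of four morphemes.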

digling/burmish, issue #19