digling / burmish

LingPy plugin for handling a specific dataset
GNU General Public License v2.0

Add an additional segmentation level for phrases #19

Closed LinguList closed 8 years ago

LinguList commented 10 years ago

So far, we distinguish morphemes and phonemes in the segmentation. However, the data occasionally also shows word boundaries, as in this entry:

In these cases, I suggest using the underscore, which is already recognized as a separation marker for word boundaries, and writing it internally as:

instead of

Since there are only sporadic cases where this will be needed, writing an algorithm for this level of segmentation is not worthwhile. However, we should keep it in mind when dealing with the data later on.

LinguList commented 8 years ago

No problem now, as we can have _ and + as indicated in the ortho-profile.
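
For illustration, here is a minimal sketch of how the three levels could be read back from a segmented string. It is not code from this plugin or from LingPy. It assumes that segments are separated by spaces, that + marks a morpheme boundary, and that _ marks a word boundary. The function name segment_levels and the example string are hypothetical placeholders, not entries from the dataset.

```python
def segment_levels(tokens):
    """Group a space-separated segment string into words > morphemes > phonemes.

    Assumption (not confirmed by the plugin): '+' separates morphemes
    and '_' separates words; every other token is a phoneme.
    """
    words, morphemes, phonemes = [], [], []
    for tok in tokens.split():
        if tok in ("+", "_"):
            # Any boundary marker closes the current morpheme.
            if phonemes:
                morphemes.append(phonemes)
                phonemes = []
            # A word boundary additionally closes the current word.
            if tok == "_" and morphemes:
                words.append(morphemes)
                morphemes = []
        else:
            phonemes.append(tok)
    # Flush whatever is left at the end of the string.
    if phonemes:
        morphemes.append(phonemes)
    if morphemes:
        words.append(morphemes)
    return words


# Hypothetical placeholder input, not a real entry from the data.
print(segment_levels("a b + c d _ e f"))
```

The nested list keeps one sublist per word, each holding one sublist per morpheme, so the phrase level does not need a separate algorithm as long as the markers are already present in the segmented string.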