Tokenizer IndexError on strings with just stress markers

cldf / segments

Unicode Standard tokenization routines and orthography profile segmentation

Apache License 2.0

31 stars 13 forks source link

Closed kylebgorman closed 4 years ago

kylebgorman commented 4 years ago

To reproduce:

import segments
segments.Tokenizer()("ˈ", ipa=True)

Seems like on L316 you need to check that result isn't empty?

xrotwang commented 4 years ago