cldf / segments

Unicode Standard tokenization routines and orthography profile segmentation
Apache License 2.0

Tokenizer IndexError on strings with just stress markers #46

Closed kylebgorman closed 4 years ago

kylebgorman commented 4 years ago

To reproduce:

import segments
segments.Tokenizer()("ˈ", ipa=True)

Seems like on L316 you need to check that result isn't empty?
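
For illustration, here is a minimal sketch of the kind of guard meant here. It is not the library's actual code: `attach_stress` and `STRESS_MARKS` are hypothetical names, and the sketch assumes the failure comes from indexing `result[-1]` while `result` is still empty because the input holds nothing but a stress mark.

```python
# Hypothetical names; not taken from the segments source.
STRESS_MARKS = {"\u02c8", "\u02cc"}  # primary (ˈ) and secondary (ˌ) stress


def attach_stress(graphemes):
    """Prepend each stress mark to the segment that follows it."""
    result = []
    # Walk right to left so the segment after a stress mark is already in result.
    for grapheme in reversed(graphemes):
        if grapheme in STRESS_MARKS:
            if result:
                # Normal case: glue the stress mark onto the following segment.
                result[-1] = grapheme + result[-1]
            else:
                # Guard: nothing follows the mark, so keep it as its own
                # segment instead of indexing into an empty list.
                result.append(grapheme)
            continue
        result.append(grapheme)
    # Restore left-to-right order.
    return result[::-1]
```

With this guard, `attach_stress(["ˈ"])` returns the lone stress mark instead of raising, and `attach_stress(["ˈ", "t", "a"])` still attaches the mark to the following segment. Whether a lone stress mark should be kept or dropped is a design choice this sketch does not settle.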

xrotwang commented 4 years ago

Fixed in https://github.com/cldf/segments/commit/d74a23c53bd9f25f5a90f45a5068e6284d4a7708