homermultitext / hmt-utils

Utility library for editorial work specific to the standards of the Homer Multitext project

urn:cite:hmt:msA.250r, two successive punctuation marks confuse tokenizer #126

Open neelsmith opened 9 years ago

neelsmith commented 9 years ago

φωνῇ·, ends in two successive punctuation marks, which results in the tokenizer not checking for the evil impostor middle dot. We need to check not just the last code point, but every trailing code point, working backwards AS LONG AS the code point is punctuation.
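A minimal sketch of that check, in Python rather than the project's own code, and not taken from hmt-utils. The names trailing_punctuation, has_impostor_dot and IMPOSTOR_DOTS are hypothetical, and treating U+00B7 (MIDDLE DOT) as the impostor is an assumption made only for illustration; which code point counts as the impostor is project policy.

```python
import unicodedata

# Assumption: U+00B7 stands in for the "impostor" dot here. The real set of
# disallowed code points is project policy and is not stated in this issue.
IMPOSTOR_DOTS = frozenset({"\u00b7"})


def trailing_punctuation(token: str) -> str:
    """Return the run of punctuation code points at the end of token."""
    start = len(token)
    # Walk backwards for as long as the code point is punctuation
    # (any Unicode general category beginning with "P").
    while start > 0 and unicodedata.category(token[start - 1]).startswith("P"):
        start -= 1
    return token[start:]


def has_impostor_dot(token: str, impostors=IMPOSTOR_DOTS) -> bool:
    """True if any trailing punctuation code point is a disallowed dot."""
    return any(cp in impostors for cp in trailing_punctuation(token))


# A word followed by a dot and then a comma, as in the reported case.
token = "φωνῇ" + "\u00b7" + ","
last_only_check = token[-1] in IMPOSTOR_DOTS   # looks at the comma only
full_check = has_impostor_dot(token)           # looks at the dot and the comma
```

With a token like the one above, a check on the last code point alone sees only the comma, while the backwards walk also reaches the dot in front of it.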

neelsmith commented 9 years ago

This is closely related to issue #118 and should be fixed in tandem with it:

https://github.com/homermultitext/hmt-utils/issues/118