bowersd / textAnalysis

Tools for applying linguistic analyzers to text, and checking the output. The end goal is producing glossaries, lemmatizations, or interlinearizations.
GNU General Public License v3.0

tokenizer punctuation #10

Open bowersd opened 1 year ago

bowersd commented 1 year ago

Curly single quotes (‘ left, ’ right) are not split off from words when tokenizing.

bowersd commented 1 year ago

At least the quote wasn't split off from 'zaam in the Niibaakhom story.
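
A minimal sketch of one possible fix, not the repository's actual tokenizer: the function names `split_edge_quotes` and `tokenize` are hypothetical, and the sketch assumes tokens are whitespace-delimited. It peels curly single quotes (U+2018, U+2019) off the edges of each token and leaves word-internal ones alone. Whether an edge quote should ever stay attached to the word is a question for the analyzer's conventions and is not handled here.

```python
# Hypothetical helper names; this is an illustration, not the project's code.
CURLY_SINGLE_QUOTES = "\u2018\u2019"  # left and right curly single quotes


def split_edge_quotes(token):
    """Peel curly single quotes off both ends of one whitespace-delimited token."""
    leading = []
    while token and token[0] in CURLY_SINGLE_QUOTES:
        leading.append(token[0])
        token = token[1:]
    trailing = []
    while token and token[-1] in CURLY_SINGLE_QUOTES:
        trailing.insert(0, token[-1])  # keep the original left-to-right order
        token = token[:-1]
    core = [token] if token else []
    return leading + core + trailing


def tokenize(text):
    """Split on whitespace, then separate edge quotes into their own tokens."""
    tokens = []
    for chunk in text.split():
        tokens.extend(split_edge_quotes(chunk))
    return tokens
```

With this sketch, `tokenize("\u2018zaam")` yields the quote and `zaam` as two tokens, while a word-internal quote such as the one in `don\u2019t` stays inside its token.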