ahmetaa / zemberek-nlp

NLP tools for Turkish.

Implementing IndexedString to keep track of changes to a string during normalization #224

Closed mrmutator closed 11 months ago

mrmutator commented 5 years ago

In some cases it is useful to know the original form of a word or substring after a string has been normalized. That requires tracking the changes made during normalization. This PR does that: it lets users of the API extract the original substring that corresponds to a range of indices in the normalized string.
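
To make the idea concrete, here is a minimal sketch of how such index tracking could work. It is not the code from this PR: only the name `IndexedString` comes from the title, while the methods `replace`, `originalSubstring`, and `value` are hypothetical and chosen for illustration. The sketch assumes every normalization step can be expressed as replacing a non-empty range of the current string, and it stores, for each character of the current string, the span of the original string it came from.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT the implementation in this PR.
// Immutable string wrapper that remembers, for every character of the
// current (normalized) text, which span of the original text produced it.
final class IndexedString {
    private final String original;
    private final String current;
    // Character i of `current` maps to original.substring(starts[i], ends[i]).
    private final int[] starts;
    private final int[] ends;

    IndexedString(String original) {
        this.original = original;
        this.current = original;
        int n = original.length();
        this.starts = new int[n];
        this.ends = new int[n];
        for (int i = 0; i < n; i++) {
            starts[i] = i;      // initially every character maps to itself
            ends[i] = i + 1;
        }
    }

    private IndexedString(String original, String current, int[] starts, int[] ends) {
        this.original = original;
        this.current = current;
        this.starts = starts;
        this.ends = ends;
    }

    // Replaces current[from, to) with `replacement` and returns a new instance.
    // Every character of the replacement maps to the whole original span
    // covered by the characters that were replaced.
    IndexedString replace(int from, int to, String replacement) {
        checkRange(from, to);
        int origFrom = starts[from];
        int origTo = ends[to - 1];
        int m = replacement.length();
        int tail = current.length() - to;
        int[] ns = new int[from + m + tail];
        int[] ne = new int[from + m + tail];
        System.arraycopy(starts, 0, ns, 0, from);
        System.arraycopy(ends, 0, ne, 0, from);
        Arrays.fill(ns, from, from + m, origFrom);
        Arrays.fill(ne, from, from + m, origTo);
        System.arraycopy(starts, to, ns, from + m, tail);
        System.arraycopy(ends, to, ne, from + m, tail);
        String next = current.substring(0, from) + replacement + current.substring(to);
        return new IndexedString(original, next, ns, ne);
    }

    // Returns the original text behind current[from, to).
    String originalSubstring(int from, int to) {
        checkRange(from, to);
        return original.substring(starts[from], ends[to - 1]);
    }

    String value() {
        return current;
    }

    private void checkRange(int from, int to) {
        if (from < 0 || to > current.length() || from >= to) {
            throw new IllegalArgumentException("invalid range: " + from + ", " + to);
        }
    }
}
```

With this sketch, normalizing `"slm nbr"` by calling `replace(0, 3, "selam")` and then `replace(6, 9, "ne haber")` gives the value `"selam ne haber"`, and `originalSubstring(6, 14)` returns `"nbr"`. One design consequence worth noting: a query that covers only part of a replacement (for example just `"haber"`) still returns the whole original span, because the sketch does not try to align characters inside a replaced range.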