Closed elshize closed 1 year ago
Base: 92.87% // Head: 92.84% // Decreases project coverage by -0.03%
:warning:
Coverage data is based on head (
5e529fa
) compared to base (78dbc6b
). Patch coverage: 100.00% of modified lines in pull request are covered.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
Implements a whitespace tokenizer next to the old term tokenizer. The old tokenizer is renamed to EnglishTokenizer (as it contains English-specific rules such as possessives), and both tokenizers are now organized into a class hierarchy with a common virtual interface, so that they could be used interchangeably in the future, as well so that perhaps more tokenizers can be implemented.
Related to #494