rockymadden / stringmetric

:dart: String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein).
https://rockymadden.com/stringmetric/

Issue #24 fix character classes #25

Open zoltanmaric opened 8 years ago

zoltanmaric commented 8 years ago

Fixes #24. Apart from including the missing characters (z, Z, and 9) in their respective classes and filters, this change should yield the same results as the upstream version.
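The upstream code is not shown here, so the following is only an illustration of the kind of boundary omission described above, not the actual cause of #24. It assumes the classes are built from character ranges; `RangeBoundarySketch` and all names inside it are hypothetical.

```scala
// Illustrative sketch only: how the last character of a class can go missing.
object RangeBoundarySketch {
  // Exclusive upper bound: stops at 'y', so 'z' is silently left out.
  val lowerExclusive: Set[Char] = ('a' until 'z').toSet

  // Inclusive upper bounds: the last character of each class is covered.
  val lowerInclusive: Set[Char] = ('a' to 'z').toSet
  val upperInclusive: Set[Char] = ('A' to 'Z').toSet
  val digitsInclusive: Set[Char] = ('0' to '9').toSet

  // Hypothetical filter that keeps only ASCII letters and digits.
  def filterAlphaNumeric(cs: Array[Char]): Array[Char] =
    cs.filter(c => lowerInclusive(c) || upperInclusive(c) || digitsInclusive(c))
}
```

With inclusive bounds, z, Z, and 9 pass the filter like every other member of their class.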

I didn't want to change the functionality drastically, so as not to break existing code, but this trait actually reinvents the character classes already available on Scala's Char class. These include isLower, isUpper, toLower, isLetter, isDigit, etc.

I would suggest deprecating this trait and using the following existing alternatives (compatible with the StringTransform type):

They are not equivalent, but they are part of the standard library, and I would argue that they are much more accurate than the existing filters.
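The snippet listing the alternatives did not survive in this copy of the comment. As a rough sketch of what standard-library replacements could look like, assuming StringTransform is a function from Array[Char] to Array[Char] (an assumption about the library's alias, redefined locally here), one might write the following. The object and value names are hypothetical.

```scala
// Sketch under stated assumptions; not the code from the original comment.
object CharClassSketch {
  // Assumed shape of stringmetric's StringTransform alias, redefined locally.
  type StringTransform = Array[Char] => Array[Char]

  // Filters that keep only characters in a given class, using Char's methods.
  val filterAlpha: StringTransform = _.filter(_.isLetter)
  val filterNumeric: StringTransform = _.filter(_.isDigit)
  val filterAlphaNumeric: StringTransform = _.filter(_.isLetterOrDigit)
  val filterLowerCase: StringTransform = _.filter(_.isLower)
  val filterUpperCase: StringTransform = _.filter(_.isUpper)

  // Case conversions applied character by character.
  val toLowerCase: StringTransform = _.map(_.toLower)
  val toUpperCase: StringTransform = _.map(_.toUpper)
}
```

One likely source of the non-equivalence is that these Char methods are Unicode-aware: isLetter accepts characters such as 'é', which a filter limited to a-z and A-Z would drop.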