mpkorstanje / simmetrics

Similarity or Distance Metrics, e.g. Levenshtein, for Java
Apache License 2.0

Longer Soundex lengths #2

Closed mpkorstanje closed 9 years ago

mpkorstanje commented 9 years ago

The current implementation of SoundexSimplifier only supports Soundex codes up to a maximum length, because it pads the code with a fixed-size string of zeros before truncating. The implementation should be fixed so that it works for any requested length.

        // Drop first letter code and remove zeros
        wordStr = wordStr.substring(1).replaceAll("0", "");
        // FIXME: This will not work for all soundex lengths
        wordStr += "000000000000000000"; /* pad with zeros on right */
        // Add first letter of word and size to taste
        wordStr = firstLetter + "-" + wordStr.substring(0, length - 2);
        return wordStr;

mpkorstanje commented 9 years ago

Done. Soundex works with any length now.
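
For readers who want to see the idea, here is a minimal sketch of length-independent padding. It is not the code that was actually committed: the class `SoundexPadding` and the method `padAndTrim` are hypothetical names used only for illustration. It assumes `codes` already has the first letter's code dropped and the zeros removed, as in the snippet above.

```java
final class SoundexPadding {

    // Hypothetical helper, not the actual SoundexSimplifier code.
    // Builds "<firstLetter>-<digits>" with exactly `length` characters.
    static String padAndTrim(char firstLetter, String codes, int length) {
        // One character for the letter and one for the dash.
        int digits = length - 2;
        if (digits < 0) {
            throw new IllegalArgumentException("length must be at least 2");
        }

        StringBuilder sb = new StringBuilder(length);
        sb.append(firstLetter).append('-');

        // Copy at most `digits` code characters.
        sb.append(codes, 0, Math.min(codes.length(), digits));

        // Pad with as many zeros as the requested length needs,
        // instead of relying on a fixed-size string of zeros.
        while (sb.length() < length) {
            sb.append('0');
        }
        return sb.toString();
    }
}
```

With this sketch, `padAndTrim('R', "163", 6)` gives `R-1630`, and a request for 30 characters is padded with zeros up to 30 rather than throwing from `substring`.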