mpkorstanje / simmetrics

Similarity or Distance Metrics, e.g. Levenshtein, for Java
Apache License 2.0

Soundex optimization #3

Closed: mpkorstanje closed this issue 9 years ago

mpkorstanje commented 9 years ago

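This code is slow because every call to String.replaceAll compiles a new Pattern from its regex argument, so each encoded word triggers seven pattern compilations. The calls should be replaced by precompiled patterns that are reused across invocations.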

        wordStr = wordStr.replaceAll("[aeiouwh]+", "0");
        wordStr = wordStr.replaceAll("[bpfv]+", "1");
        wordStr = wordStr.replaceAll("[cskgjqxz]+", "2");
        wordStr = wordStr.replaceAll("[dt]+", "3");
        wordStr = wordStr.replaceAll("[l]+", "4");
        wordStr = wordStr.replaceAll("[mn]+", "5");
        wordStr = wordStr.replaceAll("[r]+", "6");
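
One possible shape for the precompiled version is sketched below. It is only an illustration, not necessarily the change that ended up in the repository: the class name SoundexGroups and the method encodeGroups are hypothetical, and only the seven character classes are taken from the snippet above. Since the replacement digits run from 0 to 6 in order, the array index doubles as the replacement.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the name and structure are assumptions for illustration.
final class SoundexGroups {

    // Compiled once when the class is loaded. Pattern instances are
    // immutable and safe to share between threads.
    private static final Pattern[] GROUPS = {
        Pattern.compile("[aeiouwh]+"),  // replaced by "0"
        Pattern.compile("[bpfv]+"),     // replaced by "1"
        Pattern.compile("[cskgjqxz]+"), // replaced by "2"
        Pattern.compile("[dt]+"),       // replaced by "3"
        Pattern.compile("[l]+"),        // replaced by "4"
        Pattern.compile("[mn]+"),       // replaced by "5"
        Pattern.compile("[r]+"),        // replaced by "6"
    };

    private SoundexGroups() {
    }

    // Applies the same seven replacements, in the same order, as the
    // chained replaceAll calls, but reuses the compiled patterns.
    static String encodeGroups(String wordStr) {
        for (int i = 0; i < GROUPS.length; i++) {
            wordStr = GROUPS[i].matcher(wordStr).replaceAll(Integer.toString(i));
        }
        return wordStr;
    }
}
```

A Matcher is still created per call, which is cheap compared to compiling the regex. Like the original snippet, this sketch assumes wordStr has already been lower-cased.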

mpkorstanje commented 9 years ago

Fixed in the develop branch.