rm-hull / clustering

Implementation of K-Means, Self-Organising Maps, QT and Hierarchical clustering algorithms, in Clojure.
https://www.destructuring-bind.org/clustering
MIT License

Levenshtein distance runs into a stack overflow error for longer strings #6

Open rbuchmann opened 7 years ago

rbuchmann commented 7 years ago

I used it to compare strings of around 130 characters in length. The clj-fuzzy implementation doesn't crash on the same input and yields a distance of around 40.

rm-hull commented 7 years ago

Can you provide a simple test case please?

rbuchmann commented 7 years ago

Sure:

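(require '[clojure.string :as str])
;; `levenshtein` is assumed to alias this library's Levenshtein distance namespace.

(def s "some pretty long string with ")
(def e " uuids in the middle")

;; Wraps n comma-separated random UUIDs between s and e.
(defn t [n]
  (str s (str/join "," (repeatedly n #(str (java.util.UUID/randomUUID)))) e))

;; Reported to overflow the stack:
(levenshtein/distance (t 2) (t 2))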
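
For comparison, here is a minimal sketch of a Levenshtein distance that builds the edit table one row at a time with `reduce`, so stack depth does not grow with string length. It assumes the overflow comes from deep recursion, which this thread does not confirm. `iterative-distance` is a hypothetical name for illustration and not part of this library or of clj-fuzzy.

```clojure
(defn iterative-distance
  "Levenshtein distance between strings a and b.
  Hypothetical helper: keeps only the previous row of the edit table
  and uses reduce instead of recursion."
  [a b]
  (let [b (vec b)
        n (count b)]
    (peek
      (reduce
        (fn [prev [i ca]]
          ;; prev holds distances from the first i chars of a to every
          ;; prefix of b; build the row for the first (inc i) chars.
          (reduce
            (fn [row j]
              (let [cost (if (= ca (nth b j)) 0 1)]
                (conj row
                      (min (inc (nth prev (inc j)))    ; deletion
                           (inc (peek row))            ; insertion
                           (+ (nth prev j) cost)))))   ; substitution
            [(inc i)]
            (range n)))
        ;; Row 0: distance from the empty prefix of a to each prefix of b.
        (vec (range (inc n)))
        (map-indexed vector a)))))

(iterative-distance "kitten" "sitting") ;; => 3
```

This runs in O(|a| * |b|) time and O(|b|) space, so inputs of a few hundred characters, such as the ones in the test case above, should be well within reach.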