More information required about the different metrics

mpkorstanje / simmetrics

Similarity or Distance Metrics, e.g. Levenshtein, for Java

Apache License 2.0

41 stars, 15 forks

Hi, sorry, this is not really an issue, but I have raised a simmetrics question on http://stackoverflow.com/questions/40740577/should-i-use-stringmetric-or-multisetmetric-for-comparing-these-strings-with-sim that I hope you can help me with.

Having said that, it would be helpful if there were a page that grouped and explained the metrics, so that casual users have a better chance of picking the right algorithm. For example, I have only just realized that CosineSimilarity with the WhiteSpace tokenizer treats the words in a sentence as a set and ignores their order in the sentence. Happily, this is essentially what I want it to do.
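To illustrate the order-insensitivity described above, here is a minimal sketch in plain Java. It does not use the simmetrics API and is not the library's implementation. The class `CosineSketch` and its methods are hypothetical names, and it assumes cosine similarity is computed over word counts after splitting on whitespace (a bag of words rather than a strict set).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of simmetrics.
class CosineSketch {

    // Split on runs of whitespace and count how often each word occurs.
    static Map<String, Integer> tokenCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of the word-count vector.
    static double norm(Map<String, Integer> counts) {
        double sum = 0;
        for (int c : counts.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity of the two word-count vectors.
    // Word positions are never consulted, so order cannot affect the result.
    static double cosine(String a, String b) {
        Map<String, Integer> ca = tokenCounts(a);
        Map<String, Integer> cb = tokenCounts(b);
        // Assumed conventions for empty input, chosen for this sketch.
        if (ca.isEmpty() && cb.isEmpty()) return 1.0;
        if (ca.isEmpty() || cb.isEmpty()) return 0.0;
        double dot = 0;
        for (Map.Entry<String, Integer> e : ca.entrySet()) {
            dot += e.getValue() * cb.getOrDefault(e.getKey(), 0);
        }
        return dot / (norm(ca) * norm(cb));
    }

    public static void main(String[] args) {
        // Same words in a different order: prints 1.0
        System.out.println(cosine("the quick brown fox", "fox brown quick the"));
        // Only one word in common: a much lower score
        System.out.println(cosine("the quick brown fox", "the lazy dog"));
    }
}
```

If word order matters for a comparison, a token-based measure like this one is probably the wrong choice, and an edit-distance style metric such as Levenshtein may be a better fit.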


More information required about the different metrics #15