rth / vtext

Simple NLP in Rust with Python bindings
Apache License 2.0

Add Sørensen-Dice string similarity #38

Closed rth closed 5 years ago

rth commented 5 years ago

Adds the Sørensen-Dice string similarity metric. Both strings are tokenized into 2-character n-grams (bigrams) before they are compared.

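use vtext::metrics::string::dice_similarity;

// "healed" and "sealed" each yield 5 bigrams and share 4 of them
// ("ea", "al", "le", "ed"), so the score is 2 * 4 / (5 + 5) = 0.8.
assert_eq!(dice_similarity("healed", "sealed"), 0.8);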
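
For readers who want to see how the score in the example comes about, here is a minimal standalone sketch of the Sørensen-Dice coefficient over character bigrams. It is not the code from this PR: the names `char_bigrams` and `dice_similarity_sketch` are hypothetical, the use of sets (rather than counted n-grams) is an assumption, and so is the handling of strings too short to contain a bigram.

```rust
use std::collections::HashSet;

/// Collect the set of 2-character n-grams (bigrams) of `s`.
/// Operates on Unicode scalar values (`char`), not on bytes.
fn char_bigrams(s: &str) -> HashSet<(char, char)> {
    let chars: Vec<char> = s.chars().collect();
    chars.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Hypothetical helper (not vtext's implementation):
/// Dice = 2 * |A ∩ B| / (|A| + |B|), where A and B are the bigram sets.
fn dice_similarity_sketch(a: &str, b: &str) -> f64 {
    let (x, y) = (char_bigrams(a), char_bigrams(b));
    let total = x.len() + y.len();
    if total == 0 {
        // Assumption: with no bigrams on either side, fall back to
        // exact string equality instead of dividing by zero.
        return if a == b { 1.0 } else { 0.0 };
    }
    let shared = x.intersection(&y).count();
    2.0 * shared as f64 / total as f64
}
```

Under these assumptions, `dice_similarity_sketch("healed", "sealed")` works out to 2 * 4 / 10, matching the value in the PR description up to floating-point rounding. A set-based variant ignores repeated bigrams, so it may differ from a count-based implementation on strings such as "aaaa".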